P2P Online Lending Default Prediction Based on User Behavior Data
Topics: default prediction + P2P online lending; Source: master's thesis, Shanghai Normal University, 2017
[Abstract]: With the growth of Internet finance, the increasing number of P2P platforms, and rapidly rising demand for online loans, credit risk assessment and default prediction for online lending users have become especially important. In online lending, individual loan amounts are usually small and the number of loans is very large, so traditional manual approval can no longer meet business needs. Moreover, most online lending customers have no credit record, so credit evaluation based on basic information alone can hardly identify a user's default risk effectively. Yet online lending platforms are built on the Internet and therefore have a natural data advantage: making full use of a platform's existing data, integrating third-party data, and mining user behavior in depth to anticipate default is a major direction for future development.
Based on loan transaction data that include user login logs and user information update logs, this thesis divides the data into six modules, including basic information, third-party data, geographic information, login logs, and information update logs, and mines and analyzes each in depth. It applies feature engineering methods from machine learning to expand and refine the data. The study finds that users who frequently update their personal information before borrowing are more likely to default than other users. Starting from the engineered features, the thesis then combines wrapper and filter feature selection to screen and reduce them further, constructs the feature subset with the greatest predictive power for user default, and trains a model with the XGBoost framework, obtaining a default prediction model whose accuracy and stability both reach the expected level.
Drawing on an in-depth analysis of the business scenario behind the loan data and on the model-building workflow, the thesis offers recommendations for applying user behavior log data to default prediction. It argues that such data are well suited as a sample for deriving anti-fraud rules: through analysis and modeling, early-warning indicators can be obtained and deployed at the back end of the main risk-control model, where they are used to adjust a user's risk level or to flag likely default and trigger manual intervention.
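The abstract's behavioral finding and its proposed back-end warning rule can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and does not come from the thesis: the record layouts, the names `loans`, `update_log`, `updates_before_loan`, and `adjust_risk_level`, the 7-day look-back window, the cut-off of 3 updates, and the toy records themselves, which were constructed so that the pattern appears.

```python
from collections import Counter
from datetime import date, timedelta

# Toy, made-up records for illustration only; not data from the thesis.
# loans: user_id -> (loan date, whether the loan defaulted)
loans = {
    "u1": (date(2016, 3, 10), True),
    "u2": (date(2016, 3, 12), False),
    "u3": (date(2016, 3, 15), True),
    "u4": (date(2016, 3, 20), False),
}
# update_log: (user_id, date on which a profile field was changed)
update_log = [
    ("u1", date(2016, 3, 8)), ("u1", date(2016, 3, 9)), ("u1", date(2016, 3, 9)),
    ("u2", date(2016, 2, 1)),
    ("u3", date(2016, 3, 13)), ("u3", date(2016, 3, 14)), ("u3", date(2016, 3, 14)),
    ("u4", date(2016, 3, 19)),
]

WINDOW_DAYS = 7        # assumed look-back window before the loan date
FREQUENT_UPDATES = 3   # assumed cut-off for "frequent" updating


def updates_before_loan(loans, update_log, window_days=WINDOW_DAYS):
    """Count each user's profile updates in the window ending just before the loan date."""
    counts = Counter({user: 0 for user in loans})
    for user, day in update_log:
        if user not in loans:
            continue
        loan_day = loans[user][0]
        if loan_day - timedelta(days=window_days) <= day < loan_day:
            counts[user] += 1
    return dict(counts)


def default_rate(users, loans):
    """Share of the given users whose loan defaulted; None if the group is empty."""
    users = list(users)
    if not users:
        return None
    return sum(loans[u][1] for u in users) / len(users)


def adjust_risk_level(base_level, n_updates, threshold=FREQUENT_UPDATES):
    """Hypothetical back-end rule: raise the risk level one step and flag for manual review."""
    if n_updates >= threshold:
        return base_level + 1, True
    return base_level, False


counts = updates_before_loan(loans, update_log)
frequent = [u for u, n in counts.items() if n >= FREQUENT_UPDATES]
others = [u for u, n in counts.items() if n < FREQUENT_UPDATES]
print("updates before loan:", counts)
print("default rate, frequent updaters:", default_rate(frequent, loans))
print("default rate, other users:", default_rate(others, loans))
print("adjusted level for u1:", adjust_risk_level(2, counts["u1"]))
```

In this sketch the rule runs after the main model has assigned `base_level`, mirroring the abstract's suggestion that log-derived indicators sit behind the main risk-control model rather than replace it. In practice the window and threshold would have to be estimated from real log data.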
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification numbers]: F724.6; F832.4
Article ID: 1974472
Article link: http://sikaile.net/jingjilunwen/guojimaoyilunwen/1974472.html