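Research on Personal Credit Evaluation Methods for Internet Finance Based on Support Vector Machines
Topic: support vector machine + bagging; Source: Zhejiang University of Finance and Economics, master's thesis, 2017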
[Abstract]: China's rapid economic development has raised residents' capacity for credit consumption. The rapid growth of Internet finance has made credit consumption more convenient, and credit products such as personal housing mortgage loans, personal micro-loans, and credit card consumer loans have sprung up. As China's economy becomes more credit-based, the role of credit consumption in driving economic growth is increasingly prominent, and residents' willingness and ability to consume on credit are rising steadily. Domestic Internet financial institutions have made personal consumer lending one of their future development strategies. However, their risk management for personal consumer loans remains relatively weak, and their management tools and methods are still relatively backward. In addition, these institutions lack an effective personal credit evaluation method, which seriously hinders the development of personal credit business. An effective credit evaluation model can both increase the profits of Internet financial institutions and expand their lending scale, so research on personal credit evaluation methods is of great significance.

In the era of Internet finance, the way credit data are obtained has changed: besides credit data from traditional financial institutions, e-commerce data can be obtained from e-commerce platforms and social data from social platforms. The result is a large increase in the volume of credit data, which brings both great opportunities and great challenges to credit rating. Without the ability to process big data, the value hidden in massive credit data cannot be fully mined. Internet financial institutions already use quantitative models to assess consumers' personal credit risk, and the credit evaluation model is one focus of that research. The support vector machine (SVM) is a data-driven model: it processes data through supervised learning and does not require special assumptions about the data. Its advantages are more pronounced when data are abundant or easy to obtain, which is why it is favored by researchers. Because the SVM generalizes well relative to other models, this thesis proposes an SVM-based ensemble model. Set against the background of the big data era, the thesis also explores data analysis and integration for evaluating Internet finance personal credit data.

Building on existing research, the thesis proposes the ensemble model RSBC-SVM. It uses the SVM as the base learner and combines two common ensemble strategies, bagging and random subspace, with a correlation-minimization ensemble selection method. It also uses a pattern search algorithm for parameter optimization. The RSBC-SVM model is built in four stages. The first stage is data partitioning: the original data are split into an initial training set, a validation set, and a test set. The training set is used to train the individual learners, the validation set to select among them, and the test set to verify the performance of the resulting ensemble. The initial training set is then processed by the bagging and random subspace algorithms to produce a number of new training subsets. The second stage is the training of individual learners: an SVM model is built on each new training subset and tuned with the pattern search algorithm. From the perspective of a single learner, searching for parameters with pattern search improves its generalization ability; from the perspective of the relationship between learners, pattern search assigns different parameters to each SVM model and so increases the diversity of the individual learners. The third stage is the selection of individual learners: the thesis prunes the ensemble with the correlation-minimization method. Reducing the ensemble size lowers the storage and prediction costs of the model and also increases the differences between individual learners. The fourth stage, the last step in building RSBC-SVM, is model combination: a Sigmoid function first converts each SVM's decision-value output into a probability output, and the individual learners are then combined by simple averaging.

Finally, the thesis tests the RSBC-SVM model on Internet finance personal credit data. The data are preprocessed before the experiments: missing values are imputed with the random forest method, outliers are removed with the box plot method, and the variables are processed with a logarithmic transformation and normalization. The model is then compared with five other models. The results show that the proposed model performs best, which gives it considerable practical relevance.

The theoretical contribution of the thesis lies in its in-depth study of the support vector machine and its proposal of the new ensemble model RSBC-SVM, which enriches theoretical research on SVMs. One factor that affects the performance of an ensemble is the diversity among its individual learners: the more diverse the learners, the better the ensemble performs. In trying to increase that diversity, earlier researchers focused on data perturbation, feature perturbation, and parameter perturbation, and neglected the selection of individual learners before combination. In the context of Internet finance, the thesis uses the correlation-minimization ensemble selection method to choose individual learners, which provides a useful reference for research on learner selection in ensemble models. This work has theoretical significance in enriching the study of support vector machines, and practical significance in advancing the construction of China's credit system, improving the risk management of Internet financial institutions in the consumer credit market, and promoting the further development of China's consumer credit market.
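As a rough illustration of the subset generation, learner selection, and combination steps described in the abstract, the sketch below uses only the Python standard library. It is not the thesis's implementation. The function names, the greedy selection rule, the Platt-style sigmoid parameters `a` and `b`, and the toy decision values are all assumptions made for illustration. SVM training and pattern search are omitted, so the decision values stand in for the outputs of already trained learners on a validation set.

```python
import math
import random


def make_subsets(n_rows, n_features, n_learners, feature_frac, rng):
    """Bagging + random subspace: one (row indices, feature indices) pair per learner."""
    k = max(1, round(feature_frac * n_features))
    subsets = []
    for _ in range(n_learners):
        # Bagging: bootstrap sample of rows, drawn with replacement.
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Random subspace: feature subset, drawn without replacement.
        cols = sorted(rng.sample(range(n_features), k))
        subsets.append((rows, cols))
    return subsets


def pearson(a, b):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_min_correlation(val_scores, keep):
    """Greedy pruning (an assumed rule): seed with the least-correlated pair,
    then add the learner with the lowest mean |correlation| to those chosen."""
    m = len(val_scores)
    if not 2 <= keep <= m:
        raise ValueError("keep must be between 2 and the number of learners")
    corr = [[abs(pearson(val_scores[i], val_scores[j])) for j in range(m)]
            for i in range(m)]
    _, i, j = min((corr[i][j], i, j) for i in range(m) for j in range(i + 1, m))
    chosen = [i, j]
    while len(chosen) < keep:
        rest = [c for c in range(m) if c not in chosen]
        best = min(rest, key=lambda c: sum(corr[c][s] for s in chosen) / len(chosen))
        chosen.append(best)
    return sorted(chosen)


def sigmoid_probability(f, a=-1.0, b=0.0):
    """Platt-style map from a decision value f to 1 / (1 + exp(a*f + b)).
    In practice a and b would be fitted; the defaults here are placeholders."""
    return 1.0 / (1.0 + math.exp(a * f + b))


def ensemble_probability(decision_values, a=-1.0, b=0.0):
    """Simple average of the per-learner probabilities for one sample."""
    probs = [sigmoid_probability(f, a, b) for f in decision_values]
    return sum(probs) / len(probs)


rng = random.Random(0)
subsets = make_subsets(n_rows=8, n_features=5, n_learners=4, feature_frac=0.6, rng=rng)

# Toy validation-set decision values for four hypothetical learners.
val_scores = [
    [1.2, -0.7, 0.4, -1.5, 0.9],
    [1.1, -0.8, 0.5, -1.4, 1.0],   # nearly a copy of learner 0
    [-0.3, 0.6, 1.4, -0.2, -0.9],
    [0.8, 0.9, -1.1, -0.6, 0.2],
]
chosen = select_min_correlation(val_scores, keep=3)
p = ensemble_probability([val_scores[i][0] for i in chosen])
print(chosen, round(p, 3))
```

In this toy data, learners 0 and 1 produce almost identical scores, so the greedy rule keeps at most one of them. A real pipeline would train an SVM on each `(rows, cols)` subset and would probably also weigh each learner's validation accuracy during selection.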
[Degree-granting institution]: Zhejiang University of Finance and Economics
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F724.6; F832.4