Customer Credit Risk Assessment and System Design Based on Bayesian Networks
Keywords: Bayesian network; default risk; attribute reduction; data mining. Source: Yunnan University, 2014 master's thesis. Document type: degree thesis
[Abstract]: Credit risk, also called default risk, is one of the three major risks banks face and the most significant of them. Assessing credit risk is a key step in bank risk management: based on the assessment, a bank can take measures to avoid, transfer, or hedge the risk and keep it from deteriorating into actual losses. With the development of information technology, credit risk assessment has gradually adopted concepts from information science. It no longer relies purely on mathematical statistics but also applies data mining and knowledge discovery techniques in risk assessment systems. Data mining extends probability theory and mathematical statistics and combines mathematics with computer science. A Bayesian network is a data mining technique for classification and prediction that, drawing on conditional probability and modern computing power, presents the global dependencies among nodes in an intuitive graphical form. Because a Bayesian network is a "white-box" model, it is highly interpretable and therefore easy for people to accept.
Research on Bayesian networks covers structure learning, parameter learning, and Bayesian inference, a broad and substantial body of work. This thesis therefore systematically introduces techniques and methods commonly used in Bayesian network learning and inference, including the Bayesian score, the likelihood score, collider-based edge orientation, and Gibbs sampling. A data mining system is an application system built around a data mining technique and supplemented with data preprocessing and visualization modules. Taking a Bayesian network model as its core, this thesis designs a data mining system that computes customers' default probabilities and specifies the data preprocessing module in detail.
The thesis first uses rough set theory to obtain a reduced, minimal set of indicators, with the aim of lowering the time complexity of Bayesian network learning and improving its accuracy. It then applies greedy search to find the Bayesian network model with the highest posterior probability and uses the EM (expectation-maximization) method to learn the node parameter tables from the data set with missing values, yielding a Bayesian network, PDBN, that reflects customers' default probabilities. Exact inference on PDBN gives the default probability of each customer in the test set. In practice a raw probability value is often not concise enough, so the thesis further classifies customers by default probability and compares the result with the logistic regression model and the neural network method, both common in default risk assessment. To assess how classification errors affect bank revenue, a loss matrix is also taken into account. The evaluation finds that the Bayesian network is more accurate and more interpretable.
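The last two steps of the pipeline described in the abstract, exact inference of a default probability and classification under a loss matrix, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the three-node network (`income`, `history`, `default`), the probability tables, the loss values, and the function names `default_probability` and `decide`. Inference is done by plain enumeration of the joint distribution, which is exact but only practical for very small networks. The thesis does not state here which exact inference method it uses.

```python
from itertools import product

# Hypothetical toy network (NOT the thesis's PDBN): income -> default <- history.
# All variables are binary; 1 means "low income", "bad history", "default".
P_INCOME = {0: 0.7, 1: 0.3}
P_HISTORY = {0: 0.8, 1: 0.2}
# Conditional probability table: P(default = 1 | income, history)
P_DEFAULT = {(0, 0): 0.02, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.50}


def joint(income, history, default):
    """Joint probability from the network factorization."""
    p_d1 = P_DEFAULT[(income, history)]
    p_d = p_d1 if default == 1 else 1.0 - p_d1
    return P_INCOME[income] * P_HISTORY[history] * p_d


def default_probability(evidence):
    """Exact inference by enumeration: P(default = 1 | evidence).

    `evidence` maps "income" and/or "history" to an observed value (0 or 1).
    """
    totals = {0: 0.0, 1: 0.0}
    for income, history, default in product((0, 1), repeat=3):
        assignment = {"income": income, "history": history}
        # Skip assignments that contradict the observed evidence.
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        totals[default] += joint(income, history, default)
    return totals[1] / (totals[0] + totals[1])


# Hypothetical loss matrix: LOSS[decision][true outcome], outcome 1 = default.
LOSS = {
    "approve": {0: 0.0, 1: 5.0},  # approving a defaulter is costly
    "reject": {0: 1.0, 1: 0.0},   # rejecting a good customer loses business
}


def decide(p_default):
    """Pick the decision with the smallest expected loss."""
    expected = {
        decision: (1.0 - p_default) * losses[0] + p_default * losses[1]
        for decision, losses in LOSS.items()
    }
    return min(expected, key=expected.get)


if __name__ == "__main__":
    p = default_probability({"income": 1})
    print(p, decide(p))
```

With these assumed losses, approving is preferred only while `5 * p < 1 - p`, that is, while the default probability stays below 1/6. Changing the loss matrix moves that cut-off, which is why a loss-aware classification can differ from one that simply thresholds the probability at 0.5.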
[Degree-granting institution]: Yunnan University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: F832.33; TP18