Modeling and Application of an Imbalanced Support Vector Machine Based on Improved SMOTE
[Abstract]: The support vector machine (SVM) is a classical classification method in machine learning. It offers good classification performance and fast training, especially in nonlinear classification scenarios. Built on rigorous mathematical derivation and solid statistical foundations, SVM has been widely used in fields such as industrial production and intrusion detection. At the same time, as the economy develops, personal credit has become increasingly important, and with continuing advances in data mining, machine learning on large data sets has gradually replaced manual screening. However, as the cost of data acquisition and storage falls rapidly, the complexity of the data in classification problems grows along with its volume: the data dimension keeps increasing, and the class distribution becomes more and more one-sided. These changes make classification harder, and for support vector machines they seriously degrade the performance of the classical classifier in specific scenarios. To deal with the problems caused by growing data volume and more complex practical scenarios, it is necessary to consider, in light of the inherent characteristics of support vector machines, how imbalanced data and index complexity affect the classification results. Starting from the root causes of the performance loss, the classical support vector machine can then be improved so that it keeps its rigorous theoretical basis while gaining further practical value. This paper systematically studies classical support vector machine theory and related theory.
It then discusses how to handle data imbalance in support vector machines, together with the modeling and implementation of solutions, and proposes an improved support vector machine algorithm that is self-adaptive and robust to imbalanced data. The main research contents are as follows: (1) The modeling and application of the SVM classifier in the fuzzy case are studied, in particular the SVM classifier based on interval numbers. For samples that contain interval numbers, a sampling method based on hypercube vertex sampling is proposed. (2) A shortcoming of the traditional SMOTE algorithm on imbalanced data is analyzed: it operates on the entire minority class without considering the meaning of the samples themselves. Building on SMOTE interpolation of minority samples, an improved oversampling method based on key index optimization is proposed. The complete model and algorithm flow of the improved SMOTE support vector machine for imbalanced data are then given. (3) The influence on the classification results of the key indicators and related parameters chosen when using the improved SMOTE is analyzed, and an optimized SMOTE support vector machine algorithm based on information gain is proposed. First, a hypercube vertex-sampled SMOTE support vector machine based on information gain is established; then the parameters of the improved SMOTE-SVM model are tuned automatically by an optimization algorithm. This makes the parameter settings better justified and improves classification performance. Finally, the specific flow of the combined algorithm is given.
(4) The practical problems that microfinance companies face in credit risk assessment, and the weaknesses of their current assessment practice, are analyzed. A credit risk assessment index system is constructed according to the actual operation of microfinance companies. The improved support vector machine algorithm proposed in this paper is applied to this practical problem and compared with other classical classification algorithms on overall classification performance. The distribution of the key indicators of customer default is analyzed, and user profiles are finally built from the typical characteristics of the two types of users.
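As a rough illustration of two building blocks named in the abstract, the sketch below shows classic SMOTE interpolation between minority samples and the enumeration of hypercube vertices for a sample whose features are intervals. It is not the improved algorithm proposed in the thesis, whose details the abstract does not give. The function names `smote_oversample` and `hypercube_vertices`, the neighbour count `k`, and the toy data are all assumptions made for this example.

```python
import itertools
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Classic SMOTE (illustrative): create n_new synthetic minority points.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def hypercube_vertices(interval_sample):
    """Enumerate the 2^d corner points of a sample given as (low, high) pairs."""
    return list(itertools.product(*interval_sample))


# Toy minority class in two dimensions (hypothetical data).
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (3.0, 3.0)]
new_points = smote_oversample(minority, n_new=5)

# A sample with two interval-valued features has four corner points:
# (1.0, 10.0), (1.0, 20.0), (2.0, 10.0), (2.0, 20.0)
corners = hypercube_vertices([(1.0, 2.0), (10.0, 20.0)])
```

Because every synthetic point is a convex combination of two minority samples, it stays inside the range spanned by the minority class. The resampled data could then be passed to any SVM implementation for training.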
[Degree-granting institution]: Nanjing University of Aeronautics and Astronautics
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181; F832.4
Article ID: 2176868
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2176868.html