Research on an Improved LMS-KNN Nearest Neighbor Classification Method
[Abstract]: As one of the classical machine learning algorithms, the nearest neighbor classifier requires no parameter estimation, is easy to implement, and is well suited to multi-class problems. In recent years it has been widely applied in advertising, chatbots, network security, medical care, marketing planning, and other fields. Among its variants, the nearest neighbor classifier based on local mean and class mean (LMS-KNN) improves on K-nearest neighbor (KNN) classification, which is sensitive to outliers and does not use the global information of the samples. Although LMS-KNN improves classification accuracy and efficiency, it still has drawbacks: imbalanced data degrades its classification accuracy, and the algorithm involves setting many parameters, such as the neighborhood size K, the class-mean weight, and the distance measure. To further improve the classification accuracy of LMS-KNN, this thesis carries out the following work:

1) Several commonly used nearest neighbor classification methods and the local mean and class mean nearest neighbor algorithm are summarized and analyzed; their principles, advantages, and disadvantages are compared; and the optimization algorithms used in this thesis are briefly introduced.

2) To counter the effect of imbalanced data on LMS-KNN accuracy, an iterative nearest neighbor oversampling algorithm preprocesses the data into an approximately balanced set, on which a semi-supervised LMS-KNN classifier is applied. Cross-validation and a traditional iterative scheme determine the algorithm's parameters: the cross-validation error of the classifier is modeled first, the weight of the class mean vector is then expressed as a formula based on objective decision information, and the weight is selected by a uniform iterative method with step-size optimization.

3) Because a genetic algorithm (GA) can solve nonlinear optimization problems without depending on the specific problem domain, it is used to optimize the parameters of LMS-KNN while balancing subjective and objective decision rules, improving on the classification accuracy and efficiency of the traditional algorithm. The thesis proposes a local mean and class mean nearest neighbor classifier based on a genetic algorithm: candidate class-mean weights form the initial population, classification error serves as the evaluation function, and the best class-mean weight is selected by genetic iteration. Compared with traditional KNN, LM-KNN (a local mean-based nonparametric classifier), and LMS-KNN, experiments show that the method effectively searches for appropriate feature weights on UCI datasets and achieves better classification accuracy.
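To make the core classifier concrete, here is a minimal Python sketch of LMS-KNN-style prediction. It is an illustration under stated assumptions, not the thesis's exact method: the function name, the Euclidean distance, and a single weight `alpha` blending the local-mean and class-mean distances are all placeholder choices.

```python
import numpy as np

def lms_knn_predict(X_train, y_train, x, k=5, alpha=0.5):
    """Classify one query point x (illustrative sketch).

    For each class c, take the k nearest training points of that class,
    form their local mean, and score the class by a weighted sum of the
    distances from x to the local mean and to the class mean. alpha is
    the class-mean weight the thesis tunes; 0.5 is a placeholder.
    """
    best_class, best_score = None, np.inf
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d = np.linalg.norm(Xc - x, axis=1)             # distances within class c
        nearest = Xc[np.argsort(d)[:min(k, len(Xc))]]  # k nearest of class c
        local_mean = nearest.mean(axis=0)
        class_mean = Xc.mean(axis=0)
        score = ((1 - alpha) * np.linalg.norm(x - local_mean)
                 + alpha * np.linalg.norm(x - class_mean))
        if score < best_score:
            best_class, best_score = c, score
    return best_class
```

The query is assigned to the class whose blended mean is closest; with alpha = 0 this reduces to an LM-KNN-style local-mean rule, and with alpha = 1 to a pure nearest-class-mean rule.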
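The imbalance preprocessing in item 2) can be pictured with a nearest-neighbor interpolation oversampler in the spirit of SMOTE; the thesis's iterative variant is not specified here, so this sketch only conveys the general idea. `X_min` (the minority-class sample matrix) and `n_new` (the number of synthetic points) are hypothetical names.

```python
import numpy as np

def nn_oversample(X_min, n_new, k=5, seed=None):
    """Generate n_new synthetic minority samples by interpolating
    between a random minority point and one of its k nearest
    minority-class neighbors (SMOTE-style sketch, not the thesis's
    exact iterative algorithm)."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])  # skip index 0 (the point itself)
        synthetic.append(X_min[i] + rng.uniform() * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)
```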
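Item 3)'s genetic search over the class-mean weight can be sketched as below. It reuses `lms_knn_predict` from the first sketch; the population size, averaging crossover, and Gaussian mutation are arbitrary placeholder operators, while the fitness (classification error) follows the abstract.

```python
import numpy as np

def ga_select_weight(X_tr, y_tr, X_val, y_val, k=5, pop_size=20,
                     generations=30, mut_sigma=0.05, seed=None):
    """Genetic search for the class-mean weight alpha in [0, 1],
    using the validation error of lms_knn_predict as the fitness."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, size=pop_size)  # initial population of weights

    def error(alpha):
        preds = [lms_knn_predict(X_tr, y_tr, x, k=k, alpha=alpha)
                 for x in X_val]
        return np.mean(np.asarray(preds) != y_val)

    for _ in range(generations):
        fitness = np.array([error(a) for a in pop])
        parents = pop[np.argsort(fitness)[:pop_size // 2]]  # lower error = fitter
        # crossover: average two randomly chosen parents
        kids = np.array([(rng.choice(parents) + rng.choice(parents)) / 2
                         for _ in range(pop_size - len(parents))])
        # mutation: small Gaussian noise, clipped back into [0, 1]
        kids = np.clip(kids + rng.normal(0.0, mut_sigma, len(kids)), 0.0, 1.0)
        pop = np.concatenate([parents, kids])
    return pop[np.argmin([error(a) for a in pop])]
```

The returned weight would then be passed as `alpha` to `lms_knn_predict` for final classification.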
【Degree-granting institution】: University of Electronic Science and Technology of China
【Degree level】: Master's
【Year of degree conferral】: 2017
【CLC number】: TP181