Research on Adaptive Weighted Extreme Learning Machines for Imbalanced Classification Problems
發(fā)布時間:2018-01-07 06:31
Source: Shenzhen University, master's thesis, 2017. Document type: degree thesis.
Keywords: imbalanced data; classification; imbalanced classification; extreme learning machine; weighted extreme learning machine
【Abstract】: The extreme learning machine (ELM), proposed in 2006 by Huang et al. at Nanyang Technological University, Singapore, is a learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). During training it does not tune the network's input weights or the hidden neurons' biases; only the number of hidden-layer neurons must be set. Because the output weights are obtained as the unique optimal solution of a least-squares problem, ELM trains SLFNs far faster than iterative methods and, to some extent, lowers the risk of overfitting. It remains sensitive, however, to imbalanced data distributions. In 2013, Zong et al. added a weighting scheme to ELM and proposed the weighted extreme learning machine (WELM), which applies ELM effectively to imbalanced data sets. But WELM's weighting is fixed: in a binary classification problem where the majority class A has sumA samples and the minority class B has sumB samples, every class-A sample receives weight 1/sumA and every class-B sample receives weight 1/sumB, which is clearly not optimal.

This thesis proceeds along three lines. First, it examines how the hidden-layer output weights affect ELM on imbalanced classification problems. To observe directly how imbalanced data degrade ELM's performance, we gradually increased the imbalance ratio of several data sets and found experimentally that ELM attains its best performance precisely when the data are balanced; the degree of imbalance has a direct impact on ELM's classification quality.

Second, it proposes a new adaptive hidden-layer output weighting strategy to improve WELM's predictive performance. WELM effectively raises ELM's classification performance on imbalanced data sets, but its weighting scheme is too arbitrary. Starting from the goal of reducing the influence of misclassified samples on the classifier, the thesis proposes the self-adaptive weighted extreme learning machine (SawELM), with a redesigned mechanism for computing the output-layer weights. The mechanism consists of two modules: (1) gradually reduce the weights of misclassified training samples, and (2) dynamically update the output-layer target values of misclassified samples. The first module lessens the influence of misclassified samples when the output-layer weights are computed; the second directs SawELM to adjust the output-layer values. For the samples that WELM misclassifies, the method thus both weakens their influence in the output-weight computation and enlarges their output targets, so that the classifier learns them better.

Third, it presents extensive experimental comparisons confirming SawELM's feasibility and effectiveness. Fifty binary imbalanced data sets were randomly selected from the KEEL repository, and SawELM, ELM, and WELM were compared on three metrics: accuracy, G-mean, and F1-measure. The results show that the newly designed adaptive mechanism is effective: SawELM significantly improves WELM's performance on imbalanced classification, with G-mean and F1-measure markedly higher than both ELM's and WELM's, while its accuracy exceeds WELM's and is on par with ELM's.
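To make the fixed WELM weighting concrete, here is a minimal sketch in Python/NumPy, assuming the standard regularized WELM solution β = (I/C + HᵀWH)⁻¹HᵀWT of Zong et al.; the sigmoid activation, the hyperparameters n_hidden and C, and the function names are illustrative choices, not the thesis's code:

```python
import numpy as np

def welm_train(X, y, n_hidden=100, C=1.0, seed=0):
    """Minimal weighted ELM sketch for binary labels y in {-1, +1}.

    The hidden layer is random and never tuned; only the output weights
    beta are solved in closed form:  beta = (I/C + H^T W H)^{-1} H^T W y.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_in = rng.uniform(-1.0, 1.0, size=(d, n_hidden))   # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)           # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))           # sigmoid hidden output

    # WELM's fixed per-class weights: 1/sumA for every majority-class sample
    # and 1/sumB for every minority-class sample (sum_k = count of class k).
    counts = {label: np.sum(y == label) for label in (-1, 1)}
    w = np.array([1.0 / counts[label] for label in y])

    # Weighted regularized least squares, without building diag(w) explicitly.
    Hw = H * w[:, None]
    beta = np.linalg.solve(np.eye(n_hidden) / C + H.T @ Hw, Hw.T @ y)
    return W_in, b, beta

def welm_predict(X, W_in, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))
    return np.where(H @ beta >= 0.0, 1, -1)
```

Plain ELM is the special case w ≡ 1, which is why a heavily skewed class ratio pulls the least-squares fit toward the majority class.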
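The abstract states what SawELM's two modules do (shrink the weights of misclassified training samples; enlarge their output-layer targets) but not the exact update rules, so the decay factor, target step, and round count in the sketch below are hypothetical placeholders that only illustrate the described loop:

```python
import numpy as np

def sawelm_train(X, y, n_hidden=100, C=1.0, n_rounds=10,
                 decay=0.5, target_step=0.5, seed=0):
    """Sketch of SawELM's adaptive loop on top of the WELM solve above.

    Each round re-solves the weighted least squares, then for every
    training sample the current model misclassifies:
      module 1: multiply its weight by `decay` (weaken its influence on
                the output-weight computation), and
      module 2: push its regression target further in the direction of
                its true label (so the next solve fits it more strongly).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_in = rng.uniform(-1.0, 1.0, size=(d, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))

    counts = {label: np.sum(y == label) for label in (-1, 1)}
    w = np.array([1.0 / counts[label] for label in y])  # start from WELM weights
    t = y.astype(float)                                 # output-layer targets

    beta = None
    for _ in range(n_rounds):
        Hw = H * w[:, None]
        beta = np.linalg.solve(np.eye(n_hidden) / C + H.T @ Hw, Hw.T @ t)
        wrong = (np.where(H @ beta >= 0.0, 1, -1) != y)
        if not wrong.any():
            break
        w[wrong] *= decay                     # module 1: down-weight
        t[wrong] += target_step * y[wrong]    # module 2: enlarge targets
    return W_in, b, beta
```

Note that the two modules pull in opposite directions by design: a misclassified sample contributes less to the solve, but its enlarged target makes whatever contribution remains push the decision boundary harder toward its true class.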
【Degree-granting institution】: Shenzhen University
【Degree level】: Master
【Year degree conferred】: 2017
【CLC number】: TP181