Application of a Feature Selection Algorithm Based on Binary Search Combined with Pruned Random Forest in Near-Infrared Spectral Classification
Published: 2018-12-23 09:26
[Abstract]: Feature selection with random forest (RF) in high-dimensional spaces suffers from cumbersome computation, high memory overhead, and low classification accuracy. To address these problems, a feature selection algorithm (BSRFP) based on binary search (BS) combined with pruned random forest (RFP) is proposed. The algorithm first obtains feature importance scores from the Gini index, a measure of node purity, and removes the features with low importance scores. It then uses the BS algorithm, combined with a pruning technique based on the diversity of the base classifiers, to obtain the optimal feature subset and the classifier with the highest classification accuracy. To verify the effectiveness of the algorithm, a cigarette quality recognition model was constructed and compared with other methods. The results show that the BS algorithm simplifies the feature search process and the RFP algorithm reduces the scale of the RF algorithm; the classification accuracy of the RFP algorithm reaches 96.47%; and the features selected by the BSRFP algorithm are more strongly correlated and give higher accuracy for cigarette quality recognition.
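The abstract does not spell out how the binary search is carried out, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes the standard Gini impurity `1 - sum(p_c^2)` as the purity measure, a feature list already ranked by importance, and a caller-supplied `evaluate` function (hypothetical) that returns the classification accuracy for a feature subset. It further assumes that accuracy does not decrease as more top-ranked features are added, which is what makes bisection valid here.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Standard Gini impurity 1 - sum(p_c^2); 0.0 for an empty or pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def binary_search_feature_count(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    tolerance: float = 0.0,
) -> int:
    """Return the smallest k whose top-k features score within `tolerance`
    of the score obtained with all ranked features.

    Assumption (not from the paper): `evaluate` is non-decreasing in k,
    so the "good enough" values of k form a suffix of 1..n.
    """
    if not ranked_features:
        raise ValueError("ranked_features must not be empty")
    baseline = evaluate(ranked_features)
    lo, hi = 1, len(ranked_features)
    while lo < hi:
        mid = (lo + hi) // 2
        if evaluate(ranked_features[:mid]) >= baseline - tolerance:
            hi = mid          # top-mid features suffice; try fewer
        else:
            lo = mid + 1      # too few features; need more
    return lo


# Toy evaluator (hypothetical): only the three top-ranked features carry signal.
ranked = [f"f{i}" for i in range(1, 9)]
toy_accuracy = lambda subset: min(len(subset), 3) / 3
best_k = binary_search_feature_count(ranked, toy_accuracy)
```

With n ranked features, this calls `evaluate` about log2(n) times instead of once per candidate subset size, which is the kind of saving the abstract attributes to the BS step. In practice `evaluate` would train and score a (pruned) random forest on the chosen wavelengths.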
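[Author affiliations]: College of Information Science and Engineering, Ocean University of China; Technology Center, China Tobacco Yunnan Industrial Co., Ltd.
[Funding]: National Key Technology R&D Program of China (2015BAF12B01); China Tobacco Yunnan Industrial Co., Ltd. projects (JSZX2014YL01, 20530001020152000086)
[Classification number]: O433.4
Article ID: 2389772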
Article link: http://sikaile.net/kejilunwen/wulilw/2389772.html