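A Kernel Principal Component Analysis Rotation Forest Algorithm for Gene Data Classification
Published: 2018-03-08 21:11
Topic: kernel function  Focus: principal component analysis  Source: 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology), 2017, Issue 10  Paper type: journal article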
[Abstract]: Rotation forest (RoF) is an ensemble classification algorithm that combines linear analysis theory with decision trees. It achieves good results even with a small number of base classifiers while preserving the accuracy of the ensemble. However, some gene data sets are linearly inseparable, and the original algorithm classifies them poorly. This paper proposes a rotation forest algorithm based on kernel principal component analysis (KPCA-RoF). It uses a Gaussian radial basis kernel function together with principal component analysis to apply a nonlinear mapping and a diversity-inducing transformation to gene data sets, with emphasis on the problem of parameter selection, and then uses a decision tree algorithm for ensemble learning. Experiments show that the improved algorithm handles linearly inseparable data well and also improves classification accuracy on gene data sets.
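The transformation step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the usual rotation forest recipe (random disjoint feature subsets, a 75% bootstrap sample per subset) with linear PCA swapped for a Gaussian RBF kernel PCA. The function names, the `gamma` value, the number of subsets, and the number of retained components are all hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kpca(X, gamma, n_components):
    """Fit kernel PCA on X; returns what is needed to project new points."""
    K = rbf_kernel(X, X, gamma)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # center in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest ones
    vals, vecs = vals[idx], vecs[:, idx]
    keep = vals > 1e-10                          # drop numerically null directions
    alphas = vecs[:, keep] / np.sqrt(vals[keep])
    return {"X": X, "gamma": gamma, "alphas": alphas,
            "col_mean": K.mean(axis=0), "mean": K.mean()}

def transform_kpca(model, Z):
    """Project rows of Z onto the fitted kernel principal components."""
    Kz = rbf_kernel(Z, model["X"], model["gamma"])
    Kz_c = (Kz - Kz.mean(axis=1, keepdims=True)
            - model["col_mean"][None, :] + model["mean"])
    return Kz_c @ model["alphas"]

def fit_rotation(X, n_subsets, gamma, n_components, rng):
    """Assumed rotation step: one KPCA per random disjoint feature subset."""
    perm = rng.permutation(X.shape[1])
    models = []
    for cols in np.array_split(perm, n_subsets):
        # 75% bootstrap sample, as in the standard rotation forest recipe
        rows = rng.choice(X.shape[0], size=int(0.75 * X.shape[0]), replace=True)
        models.append((cols, fit_kpca(X[np.ix_(rows, cols)], gamma, n_components)))
    return models

def apply_rotation(models, Z):
    """Concatenate the KPCA features of every subset into one feature matrix."""
    return np.hstack([transform_kpca(m, Z[:, cols]) for cols, m in models])

# Toy usage on random data standing in for a gene expression matrix.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(40, 12))
rotation = fit_rotation(X_toy, n_subsets=3, gamma=0.1, n_components=2, rng=rng)
features = apply_rotation(rotation, X_toy)
print(features.shape)
```

In a full ensemble, each member would draw its own random feature split, fit its own rotation, and train one decision tree (for example scikit-learn's `DecisionTreeClassifier`) on the transformed features, with predictions combined by voting or averaging. Unlike linear PCA, kernel PCA yields no explicit rotation matrix, so this sketch concatenates the projected features instead. The choice of `gamma` strongly affects the mapping, which is consistent with the abstract's emphasis on parameter selection.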
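[Author affiliation]: College of Information Engineering, China Jiliang University; College of Modern Science and Technology, China Jiliang University
[Funding]: National Natural Science Foundation of China Nos. 61272315, 60905034; Zhejiang Provincial Natural Science Foundation No. Y1110342; national safety administration project (国家安全总局项目) No. zhejiang-00062014AQ
[Classification number]: R73-3; TP18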
Article ID: 1585563
Article link: http://sikaile.net/kejilunwen/jiyingongcheng/1585563.html