

Research on Classification of Tumor Gene Expression Profile Data Based on Back-Projection Representation

Published: 2018-10-08 16:02
[Abstract]: With the rapid development of gene chip technology, tumor gene expression profile data can now be obtained quickly and accurately. Feature selection and sample classification are the two basic problems in tumor classification based on gene expression profile data. Analysing these data provides a powerful tool for early tumor diagnosis and for studying tumors at the molecular level. In recent years, tumor classification based on sparse representation has attracted growing attention. However, classifiers based on sparse representation have the following problems: (1) they depend heavily on having sufficient training samples; (2) they ignore the information contained in the test samples; (3) classification by reconstruction error is unstable. In addition, designing gene selection methods that are both efficient and biologically meaningful is a current trend. To address these problems, this thesis makes the following contributions.

First, a tumor classification method based on back-projection representation and category contribution rate is proposed, and its feasibility and stability are proved theoretically. A new back-projection representation model is constructed by mining the information embedded in the test samples, which reduces the influence of the number of training samples. To complete classification with this model, a new classification criterion, the category contribution rate, is proposed. Finally, a new statistical index, the classification stability index, is defined to quantify the stability of different classification criteria.

Second, building on the first contribution, a tumor classification method is proposed that combines two-stage hybrid gene selection with the back-projection representation model. The first stage is a preliminary gene selection that combines three filter methods: BW, SNR and the F-test. The second stage applies a statistical Lasso method to the preliminarily selected informative genes to select again, yielding possible pathogenic genes. Classification is then completed with the back-projection representation model.

In the experiments for the first contribution, the effectiveness of the back-projection representation on the small-sample problem is verified first; the classification stability index is then used to verify the stability of the classification criterion based on category contribution rate; finally, the robustness of the classification method is tested. For the second contribution, the necessity of gene selection and the feasibility of Lasso are verified first; the efficiency of the two-stage hybrid gene selection method is then verified through classification performance and through visual projection plots, based on principal component analysis (PCA), at the different stages. Notably, the method is further used to select candidate pathogenic genes, which are then analysed biologically.
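The first-stage filtering described in the abstract can be illustrated with a small sketch. The abstract names the three filters (BW, SNR and the F-test) but does not say how their rankings are combined, so the code below makes assumptions: it uses the common textbook definitions of the three scores, handles only a two-class SNR, and merges the filters by taking the union of each filter's top-ranked genes. The function names (`snr_score`, `bw_ratio`, `f_statistic`, `first_stage_select`) and the toy data are hypothetical and are not taken from the thesis.

```python
from statistics import mean, stdev

EPS = 1e-12  # guards against division by zero for constant genes


def snr_score(values, labels):
    """Two-class signal-to-noise ratio: |mean0 - mean1| / (sd0 + sd1)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return abs(mean(a) - mean(b)) / (stdev(a) + stdev(b) + EPS)


def bw_ratio(values, labels):
    """Ratio of between-class to within-class sum of squares for one gene."""
    overall = mean(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        m = mean(group)
        between += len(group) * (m - overall) ** 2
        within += sum((v - m) ** 2 for v in group)
    return between / (within + EPS)


def f_statistic(values, labels):
    """One-way ANOVA F statistic; a rescaling of the BW ratio."""
    n, k = len(values), len(set(labels))
    return bw_ratio(values, labels) * (n - k) / (k - 1)


def first_stage_select(expr, labels, top_k):
    """Assumed combination rule: union of each filter's top_k genes.

    expr is a list of genes; each gene is a list of per-sample values.
    """
    keep = set()
    for score in (bw_ratio, snr_score, f_statistic):
        scores = [score(gene, labels) for gene in expr]
        order = sorted(range(len(expr)), key=lambda g: scores[g], reverse=True)
        keep |= set(order[:top_k])
    return sorted(keep)


# Toy data: 3 genes x 6 samples, two classes (made up for illustration).
labels = [0, 0, 0, 1, 1, 1]
expr = [
    [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],  # gene 0: separates the classes
    [2.0, 3.0, 2.5, 2.4, 3.1, 2.0],  # gene 1: same mean in both classes
    [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # gene 2: same mean in both classes
]
selected = first_stage_select(expr, labels, top_k=1)
```

With a fixed sample count and class count, the F statistic under these definitions is a constant multiple of the BW ratio, so the two rank genes identically; a real pipeline might therefore use a different variant of one of them, or a different combination rule such as intersection or rank averaging.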
[Degree-granting institution]: Henan University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: R73-3


Article ID: 2257373


Article link: http://sikaile.net/kejilunwen/jiyingongcheng/2257373.html

