
Research on the Role of Tumor Purity in Differential Gene Expression and Tumor Subtype Clustering

Published: 2018-02-27 05:32

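  Keywords: differentially expressed genes; generalized linear model; DNA methylation; EM algorithm; tumor purity; cancer subtypes. Source: Shanghai Normal University, 2017 doctoral dissertation. Type: degree thesis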


[Abstract]: Differential gene expression analysis between tumor and normal cells and the identification of tumor subtypes are both of great importance for the early diagnosis and clinical treatment of cancer. However, tumor tissue obtained in the clinic often contains a certain number of other cells, such as normal cells, immune cells, stromal cells, and vascular cells. Among these, contamination by normal cells adversely affects both differential gene expression analysis and tumor subtype classification. Building suitable statistical models that correct for the effect of tumor purity on differential expression analysis and on tumor clustering is therefore an urgent task. This dissertation studies both problems systematically.

First, we study the effect of tumor purity on differential expression analysis. Simulations show that the relationship between tumor purity and the difference in gene expression is multiplicative rather than linear, as had previously been assumed. Ignoring tumor purity, or adding it to the regression model as a covariate, biases the results of differential expression analysis. To solve this problem, we propose a generalized least-squares model and a Wald test for the difference of each gene between tumor and normal cells. Analysis of TCGA tumor data shows that the method outperforms the traditional t-test and limma in the number of differentially expressed genes, in the consistency of test statistics across tumors, and in functional relevance to the corresponding cancer type.

Second, we study the effect of tumor purity on the unsupervised clustering of tumor samples. Analysis of clustering results on TCGA breast cancer 450K methylation array data shows that, with traditional k-means and NMF, tumor purity biases the clustering: tumor samples of similar purity tend to fall into the same cluster, and samples with low tumor purity are easily misclustered. We therefore propose a model-based clustering algorithm for DNA methylation array data. We assume that the methylation level of tumor samples at each site follows a Gaussian mixture distribution, and use the EM algorithm for parameter estimation and sample clustering. Simulations show that the algorithm is more accurate than k-means. Analysis of 23 TCGA cancer types shows that the method yields clustering results that are less biased than those of k-means and NMF.
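The claim that purity acts multiplicatively on the expression difference can be illustrated with a small simulation. The sketch below is not the dissertation's method, whose details are not given in the abstract. It assumes a simple mixing model in which a tumor sample with purity p has observed expression mu_n + p * delta (normal samples have p = 0), and it uses ordinary rather than generalized least squares. The function name `purity_wald_test` and all simulated values are hypothetical.

```python
import numpy as np

def purity_wald_test(y_tumor, purity, y_normal):
    """Fit y = mu_n + purity * delta by least squares; return a Wald statistic for delta.

    Assumed mixing model (not taken from the source): normal samples have
    purity 0, so the slope delta estimates the pure-tumor minus normal difference.
    """
    y = np.concatenate([y_normal, y_tumor]).astype(float)
    p = np.concatenate([np.zeros(len(y_normal)), purity]).astype(float)
    X = np.column_stack([np.ones_like(p), p])      # columns: intercept, purity
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # beta = [mu_n, delta]
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of beta
    wald = beta[1] / np.sqrt(cov[1, 1])            # estimate / standard error
    return beta[0], beta[1], wald

# Simulated single gene; every value here is illustrative only.
rng = np.random.default_rng(0)
n = 200
mu_n, delta = 5.0, 2.0
purity = rng.uniform(0.3, 0.9, size=n)
y_normal = mu_n + rng.normal(0.0, 0.5, size=n)
y_tumor = mu_n + purity * delta + rng.normal(0.0, 0.5, size=n)

mu_hat, delta_hat, wald = purity_wald_test(y_tumor, purity, y_normal)

# Ignoring purity: the plain mean difference shrinks toward mean(purity) * delta.
naive_diff = y_tumor.mean() - y_normal.mean()
print(f"purity-aware delta: {delta_hat:.2f}, naive difference: {naive_diff:.2f}")
```

Under this assumed model the naive tumor-versus-normal mean difference underestimates the true difference by roughly the average purity, while the purity-scaled fit recovers it. A generalized least-squares version would additionally weight samples by a purity-dependent variance.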
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: R73-3; TP311.13


Zhang Weiwei; Research on the Role of Tumor Purity in Differential Gene Expression and Tumor Subtype Clustering [D]; Shanghai Normal University; 2017


Article ID: 1541398


Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1541398.html

