Application of the R Package InfiniumPurify in Tumor Purity Estimation and Differential Methylation Analysis
Published: 2018-05-18 01:02
Topics: DNA methylation + epigenetics; Source: Shanghai Normal University, master's thesis, 2017
[Abstract]: DNA methylation is closely associated with human development and with cancer, and both tumor purity estimation and differential methylation analysis are important topics in epigenetics research. However, methods for studying tumor purity and differential methylation from DNA methylation array data are still immature. Tumor samples contain normal cells, and tumor purity can bias the differential methylation analysis of mixed tumor-normal samples or even lead to false predictions. Existing methods do not yet fully "correct" tumor samples, and methods for estimating the methylation level of pure tumor samples remain to be developed. This thesis addresses these problems with the InfiniumPurify package. InfiniumPurify contains the following three models. First, the getPurity function estimates tumor cell purity. getPurity first identifies the most significantly differential CpG sites (denoted iDMCs) from the β-value matrix of mixed tumor-normal samples, then transforms the methylation level of each iDMC according to whether it is hypermethylated or hypomethylated, and finally applies kernel density estimation to these iDMCs to obtain the purity of the tumor sample. The predictions are highly consistent with those of ABSOLUTE and other methods. Second, the InfiniumDMC function performs differential methylation analysis while accounting for tumor purity. Because InfiniumDMC takes the purity of each tumor sample into account, it avoids the errors that impure tumor samples introduce into differential methylation analysis, and it identifies differentially methylated sites more accurately than other existing methods. InfiniumDMC can also perform differential methylation analysis when no normal control samples are available, which greatly extends the analysis and use of TCGA data. Third, the InfiniumPurify function uses a linear regression model, based on tumor samples, normal samples, and tumor purity values, to estimate the methylation level of pure tumor samples. After purity correction, the distributions of methylation levels in tumor and normal samples at differentially methylated sites differ very significantly.
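The three getPurity steps described in the abstract (pick iDMCs, transform their methylation levels by direction, apply kernel density estimation) can be illustrated with a small sketch. This is not the package's code or API: the function names, the Gaussian kernel, the fixed bandwidth, the choice of the density peak as the purity estimate, and the idealized simulation (hypermethylated iDMCs near the purity value, hypomethylated ones near one minus the purity) are all assumptions made for illustration. The iDMCs and their directions are taken as already known.

```python
import math
import random


def transform_beta(beta, is_hyper):
    """Flip hypomethylated sites so that a larger value always means 'more tumor-like'."""
    return beta if is_hyper else 1.0 - beta


def kde_mode(values, bandwidth=0.05, grid_size=201):
    """Return the grid point in [0, 1] where a Gaussian kernel density estimate peaks."""
    best_x, best_density = 0.0, -1.0
    for i in range(grid_size):
        x = i / (grid_size - 1)
        # Unnormalized density: constant factors do not move the peak.
        density = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def estimate_purity(tumor_betas, hyper_flags):
    """Estimate purity of one tumor sample from beta values at pre-selected iDMCs.

    Assumption of this sketch: the peak of the density of the transformed
    beta values is used as the purity estimate.
    """
    transformed = [transform_beta(b, h) for b, h in zip(tumor_betas, hyper_flags)]
    return kde_mode(transformed)


def simulate_tumor_sample(purity, n_sites=500, noise_sd=0.05, seed=0):
    """Hypothetical toy data: hyper sites sit near `purity`, hypo sites near 1 - purity."""
    rng = random.Random(seed)
    betas, hyper_flags = [], []
    for i in range(n_sites):
        is_hyper = (i % 2 == 0)
        expected = purity if is_hyper else 1.0 - purity
        noisy = expected + rng.gauss(0.0, noise_sd)
        betas.append(min(1.0, max(0.0, noisy)))  # beta values live in [0, 1]
        hyper_flags.append(is_hyper)
    return betas, hyper_flags


if __name__ == "__main__":
    betas, flags = simulate_tumor_sample(purity=0.7)
    print("estimated purity:", estimate_purity(betas, flags))
```

On the simulated sample the estimate should land close to the simulated purity of 0.7. Real array data would first need the iDMC selection step, which this sketch leaves out.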
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: R73-3; O212.1
Article ID: 1903661
Article link: http://sikaile.net/kejilunwen/yysx/1903661.html