Using Computational Methods to Study the Molecular Regulatory Mechanisms of Disease Mutations
Topic: disease mutations  Focus: regulatory elements  Source: Anhui University, master's thesis, 2017
【Abstract】: With the development of high-throughput sequencing technology, massive amounts of biological data are being generated, but mining the underlying biological principles from these data remains a major challenge. Bioinformatics is an interdisciplinary field that uses statistical analysis, computational methods, and other disciplines to study biology. Gene expression is a highly regulated process and has long been one of the research hotspots of bioinformatics. It can be divided into two major parts, transcription and translation. Many regulatory elements and protein molecules take part in each stage, and an abnormality at any stage may inactivate gene function, affect gene expression, and ultimately lead to disease. Regulatory elements are widely distributed across the genome and are deeply involved in gene expression, so changes in their functional activity have an important effect on gene expression. Mutations that fall within regulatory elements can alter the functional activity of those elements and cause abnormal gene expression, which is one of the important molecular mechanisms of pathogenesis. To quantitatively measure how strongly mutations in different regulatory elements affect gene expression, this thesis studies the molecular regulatory mechanisms of mutations associated with four different classes of disease and finds that different classes of disease mutations have distinct, specific molecular regulatory mechanisms. In addition, sequential pattern mining is used to model the promoter and enhancer sequences among the regulatory elements, in order to further analyze the pathogenic mechanisms of promoter and enhancer mutations. The main work and contributions of this thesis are as follows:
(1) Different classes of disease mutations are enriched in different regulatory element regions. First, nine classes of regulatory elements were obtained from data published by the FANTOM and ENCODE projects, and the distributions of the different classes across the genome were found to differ significantly. Then, four classes of disease mutation data were obtained from databases including OMIM, GWAS, ClinVar, and VarDi: genetic disease mutations, cancer-predisposing germline mutations, cancer somatic mutations, and complex disease mutations. Tallying the distribution of the four classes of disease mutations across the nine classes of regulatory elements showed that genetic disease mutations are enriched in promoters; cancer mutations are enriched in promoters, methylated regions, and regions of physical chromosomal interaction; and complex disease mutations are evenly distributed across the nine classes of regulatory elements.
(2) A sequential pattern mining model is used to study the pathogenic mechanisms of promoter and enhancer mutations and to quantify how strongly mutations affect the functional activity of promoters and enhancers. Genomic sequence data contain abundant regulatory sequences that exert regulatory functions during gene expression and give rise to different protein products. Drawing on both the variability and the conservation of these sequences, this thesis combines frequent pattern mining with the PSSM (position-specific scoring matrix) model to model promoters and enhancers, yielding a quantitative measure of promoter and enhancer signal strength. Computational validation experiments show that the model effectively distinguishes true from false promoters and enhancers. Further analysis of mutations in promoters and enhancers shows that the probability of pathogenicity increases as promoter signal strength decreases, indicating that promoter single-nucleotide mutations that reduce promoter signal strength are positively correlated with disease. By contrast, changes in signal strength caused by disease mutations in enhancers show no significant correlation with disease.
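The abstract describes scoring promoter signal strength with a PSSM and comparing the score before and after a single-nucleotide mutation. The following is only a minimal sketch of that PSSM idea, not the thesis's model: it omits the frequent pattern mining step entirely, and the toy motif instances, the uniform background frequency, the pseudocount, and all function names (`build_pssm`, `score`, `mutation_effect`) are assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pssm(sequences, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length, aligned DNA sequences.

    Assumes a uniform background frequency; real genomes would need
    measured base frequencies.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    pssm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pssm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pssm

def score(pssm, seq):
    """Sum the per-position log-odds scores: a stand-in for 'signal strength'."""
    return sum(column[base] for column, base in zip(pssm, seq))

def mutation_effect(pssm, seq, pos, alt):
    """Return score(mutant) - score(reference) for a single-nucleotide change."""
    mutated = seq[:pos] + alt + seq[pos + 1:]
    return score(pssm, mutated) - score(pssm, seq)

# Hypothetical aligned motif instances (toy data, not from the thesis).
toy_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pssm = build_pssm(toy_sites)

# Replace the conserved A at index 1 with G and measure the score change.
delta = mutation_effect(pssm, "TATAAT", 1, "G")
print(f"change in signal strength: {delta:.2f}")
```

A negative `delta` means the mutation lowers the modeled signal strength, which is the kind of change the abstract reports as associated with higher pathogenicity for promoters.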
【Degree-granting institution】: Anhui University
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: Q811.4;R3416