Biological Network Analysis and Its Application in Complex Disease Research
Published: 2018-04-04 19:43
Topic: systems biology. Focus: biological networks. Source: Central South University, doctoral dissertation, 2012
[Abstract]: Diagnosing and treating complex diseases, with cancer as a representative example, has long been both a focus and a difficulty of biomedical research. Progress has long been limited by biological experimental techniques and by techniques for analyzing experimental results, and no major breakthrough has been achieved. The rapid development of high-throughput biotechnology now provides massive data sources for the study of complex diseases. In particular, biological networks such as gene regulatory networks and protein-protein interaction networks represent the complex relationships among biological macromolecules well and provide good data support for complex disease research. Because such network data have accumulated in large quantities, researchers urgently need new techniques to analyze biological networks and, ultimately, to support the study, diagnosis, and treatment of complex diseases.

Starting from the assessment of the reliability of interaction data among biological macromolecules, this thesis studies graph clustering and the construction of dynamic networks through multi-source data fusion, and finally applies these techniques to identifying disease genes and biological processes in complex diseases. The main work is as follows.

First, biological networks produced by current high-throughput experimental techniques suffer from high false-positive and false-negative rates. Gene Ontology (GO) annotations and semantic similarity are used to assess the reliability of existing protein-protein interaction data, and statistical analysis and machine learning are used to find the semantic similarity definition best suited to that assessment.

Second, the biological networks available directly from public databases are static, which clearly does not reflect the dynamic nature of living systems. By analyzing time-series and tissue-specific gene expression data and fusing them with existing static biological networks, we construct biological networks with a degree of spatiotemporal dynamics, carry out a basic analysis of these dynamic networks, and compare them with the static networks.

Third, most existing algorithms for mining functional modules and complexes from biological networks rely only on network topology. Our analysis shows that essential proteins are unevenly distributed across functional modules and complexes, and that both functional modules and complexes have a core structure, so essential and non-essential proteins should be treated differently during clustering. On this basis we propose EPOF, a graph clustering algorithm based on essential proteins. Applied to the yeast protein-protein interaction network and evaluated by GO enrichment analysis and by comparison with known complexes, EPOF performs significantly better than other algorithms of the same kind.

Finally, building on these analyses, we apply graph clustering to gene expression data from disease and drug controlled studies and compare the clustering results using GO semantic similarity to identify disease-related biological processes. We also use disease gene signatures together with biological network data to integrate different gene signatures and identify genes closely related to the disease.

Beginning with the preprocessing of biological network data, this thesis examines a range of network analysis methods and applies them to complex disease research, with good results. The work supports the study of complex diseases from a systems perspective and helps advance research on the diagnosis and treatment of complex diseases such as cancer.
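The abstract describes building dynamic networks by fusing gene expression data with a static network but gives no procedure. The sketch below shows one simple way such a fusion could work, purely as an illustration: an interaction is kept at a time point only if both genes are expressed above a cutoff. The function name `build_dynamic_network`, the fixed `threshold` rule, and the toy data are all assumptions and are not taken from the thesis.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def build_dynamic_network(
    static_edges: Set[Edge],
    expression: Dict[str, List[float]],
    threshold: float,
) -> List[Set[Edge]]:
    """Return one edge set per time point.

    Assumed activity rule (illustrative, not the thesis's method):
    a gene is "active" at time t if its expression is >= threshold,
    and a static edge survives at t only if both endpoints are active.
    """
    # Number of time points, taken from the first expression series.
    n_points = len(next(iter(expression.values()))) if expression else 0
    snapshots: List[Set[Edge]] = []
    for t in range(n_points):
        active = {g for g, series in expression.items() if series[t] >= threshold}
        snapshots.append(
            {(u, v) for (u, v) in static_edges if u in active and v in active}
        )
    return snapshots


# Toy static network and three-point expression series (hypothetical data).
static_edges = {("A", "B"), ("B", "C"), ("A", "C")}
expression = {
    "A": [0.9, 0.1, 0.8],
    "B": [0.7, 0.9, 0.2],
    "C": [0.1, 0.8, 0.9],
}
snapshots = build_dynamic_network(static_edges, expression, threshold=0.5)
for t, edges in enumerate(snapshots):
    print(t, sorted(edges))
```

Each snapshot is a subgraph of the static network, so the static and dynamic views can be compared edge by edge. A tissue-specific variant would index the expression values by tissue instead of by time point.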
[Degree-granting institution]: Central South University
[Degree level]: Doctoral
[Year degree granted]: 2012
[Classification number]: R319; O157.5
Article ID: 1711400
Article link: http://sikaile.net/yixuelunwen/swyx/1711400.html