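Research and Application of Semantics-Based Protein Complex Identification Algorithms
Published: 2018-04-19 12:51
Topic keywords: protein interaction network + clustering coefficient; Source: Xi'an University of Technology, master's thesis, 2017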
[Abstract]: In recent years, advances in biology have driven the successful completion of the Human Genome Project, and research in systems biology and proteomics has deepened in parallel. As a result, studying protein complexes and their functional properties on the basis of the structure and biological characteristics of known protein interaction networks has become an active research topic. Protein complex identification algorithms address this problem: they can mine biologically meaningful protein complexes and predict the functions of unknown proteins. This thesis reviews the basic approaches to protein complex identification, mainly graph-partitioning methods, hierarchical methods, and methods based on fusing biological information. Building on these, it takes the structure of the protein interaction network as its starting point, incorporates the structural characteristics of protein complexes themselves, and proposes two new complex identification algorithms.
(1) A protein complex identification algorithm based on semantic similarity. Most current identification algorithms operate on unweighted protein networks and do not consider the inherent biological properties of the proteins, which considerably affects identification accuracy. The thesis therefore proposes a clustering algorithm based on semantic similarity, the DSC algorithm. The algorithm first constructs a weighted protein network and then identifies protein complexes on the unweighted network based on the edge clustering coefficient. Experiments show that the algorithm achieves good results.
(2) A protein complex identification algorithm based on hierarchical expansion from key nodes. Traditional algorithms focus on the topology of the network as a whole and neglect the structural characteristics of the complexes themselves. The thesis instead identifies protein complexes by selecting key nodes in the network and expanding from them over multiple levels. During hierarchical expansion, it uses the weighted interaction network constructed in the thesis and takes the semantic similarity between nodes as the basis for expansion. The result is the KNHE algorithm, which is applied to weighted protein networks. Because the algorithm takes full account of the importance of known key proteins and of the structural characteristics of the complexes themselves, the experimental results show a substantial improvement in sensitivity, specificity, and related measures. The two algorithms proposed in the thesis approach the problem from different angles and effectively address the problem of low identification rates; they also cluster well, and the complexes they identify are generally biologically meaningful.
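The abstract names the edge clustering coefficient without defining it. As a minimal sketch, the snippet below uses one commonly cited form, the number of common neighbors of an edge's endpoints divided by min(k_u - 1, k_v - 1). This form is an assumption and may differ from the definition used in the thesis. The function names and the toy network are hypothetical, and the semantic-similarity weighting is not modeled here.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: common neighbors of u and v divided by
    min(deg(u) - 1, deg(v) - 1); returns 0.0 when that denominator is 0."""
    common = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / denom if denom > 0 else 0.0


# Hypothetical toy network: a triangle A-B-C plus a pendant protein D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(edges)

ecc_ab = edge_clustering_coefficient(adj, "A", "B")  # edge inside the triangle
ecc_cd = edge_clustering_coefficient(adj, "C", "D")  # edge to the pendant node
```

Edges inside densely connected groups score high under this measure, while edges leading to peripheral nodes score low, which is why it is often used to decide which interactions belong to the same candidate complex.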
[Degree-granting institution]: Xi'an University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: Q51;O157.5
Article ID: 1773165
Article link: http://sikaile.net/kejilunwen/yysx/1773165.html