Research on Noise Identification Models in Protein Function Annotation

Published: 2018-11-19 17:31
[Abstract]: Proteins are the principal carriers of life processes and perform a wide range of essential functions in organisms. Automatically annotating protein functions is a key problem in bioinformatics and one of the core problems of the post-genomic era. Accurate functional annotation greatly benefits research in disease mechanism analysis and regulation, drug development, crop yield improvement, and bioenergy development. However, protein function annotations come from many sources, so noisy annotations are inevitably introduced. These noisy annotations mislead the analysis and application of protein functions and reduce the accuracy of subsequent function prediction. Existing work on protein function prediction focuses on predicting the functions of completely unannotated proteins and the missing functions of partially annotated proteins, and rarely addresses the identification of noisy annotations. This thesis studies the identification of noisy protein function annotations. Its main contributions are as follows.

(1) An algorithm for identifying noisy protein function annotations based on semantic similarity and taxonomic similarity (NoisyGOA) is proposed. The method first computes the semantic similarity between proteins and the taxonomic similarity between Gene Ontology (GO) functional labels. Then, for each functional annotation of a protein, it sums the maximum taxonomic similarity between that annotation and the annotations of each of the protein's semantic neighbors. Finally, it selects the annotations with the smallest taxonomic similarity to these neighbors as the protein's noisy annotations. Experiments on both simulated-noise and real-noise datasets for three model organisms (yeast, human, and Arabidopsis) show the effectiveness and superiority of the method in identifying noisy annotations. NoisyGOA shows that noisy protein function annotations are identifiable, and it demonstrates the contribution of semantic similarity and taxonomic similarity to identifying them.

(2) NoisyGOA's semantic similarity computation is easily affected by the noisy annotations a protein already carries, and it does not differentiate among functional annotations. This thesis therefore proposes a second method for identifying noisy annotations, based on evidence attribute weighting and sparse representation (NoGOA). NoGOA first stores the annotation information in a protein-function label association matrix, uses sparse representation to compute the semantic similarity between proteins, and lets each protein's semantic neighbors vote on its annotations to obtain a preliminary identification of its noisy annotations. Next, NoGOA collects statistics on noisy annotations from earlier periods, grouped by evidence attribute, and estimates a noise probability for each evidence attribute. It weights the annotations in the association matrix according to these probabilities and then propagates the weights upward through the hierarchical structure among the functional labels. Finally, it identifies the noisy annotations of a protein by integrating the preliminary results based on semantic similarity with the weighted protein-function association matrix. Experiments on yeast, human, and Arabidopsis show that NoGOA identifies noisy annotations more accurately than existing algorithms. In addition, to verify the effect of NoGOA, we removed the noisy annotations it identified and then performed protein function prediction on the cleaned data. The results show that this improves the accuracy of existing protein function prediction algorithms.
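The scoring step described in (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy hierarchy, the term and protein names, and both similarity measures (Jaccard overlap of ancestor sets) are assumptions chosen for brevity, since the abstract does not say which semantic or taxonomic similarity measures NoisyGOA uses. Only the overall shape follows the abstract: find the semantic neighbors, sum each annotation's maximum taxonomic similarity to each neighbor's annotations, and treat the lowest-scoring annotations as noise candidates.

```python
from typing import Dict, List, Set, Tuple

# Toy GO-like hierarchy: term -> direct parents (hypothetical term names).
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "binding": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
    "transport": {"root"},
    "ion_transport": {"transport"},
}

# Toy protein annotations (hypothetical protein identifiers).
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"dna_binding", "ion_transport"},
    "P2": {"dna_binding"},
    "P3": {"dna_binding", "rna_binding"},
    "P4": {"rna_binding", "binding"},
}


def ancestors(term: str) -> Set[str]:
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_similarity(t1: str, t2: str) -> float:
    """Assumed stand-in for taxonomic similarity between two terms."""
    return jaccard(ancestors(t1), ancestors(t2))


def protein_similarity(terms1: Set[str], terms2: Set[str]) -> float:
    """Assumed stand-in for semantic similarity between two proteins."""
    expanded1 = set().union(*[ancestors(t) for t in terms1])
    expanded2 = set().union(*[ancestors(t) for t in terms2])
    return jaccard(expanded1, expanded2)


def rank_annotations(
    protein: str, annotations: Dict[str, Set[str]], k: int
) -> List[Tuple[str, float]]:
    """Score each annotation of `protein` against its k semantic neighbors.

    Returns (term, score) pairs sorted by ascending score, so the most
    suspicious annotation (least supported by the neighbors) comes first.
    """
    own = annotations[protein]
    others = [p for p in annotations if p != protein]
    # Most similar proteins first; ties are broken by name for determinism.
    others.sort(key=lambda p: (-protein_similarity(own, annotations[p]), p))
    neighbors = others[:k]
    scores = []
    for term in own:
        score = sum(
            max(term_similarity(term, other) for other in annotations[n])
            for n in neighbors
        )
        scores.append((term, score))
    return sorted(scores, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    # The lowest-scoring annotation is listed first.
    print(rank_annotations("P1", ANNOTATIONS, k=2))
```

In this toy data the two nearest neighbors of `P1` are binding proteins, so its `ion_transport` annotation receives much less support than `dna_binding` and is ranked first as the noise candidate. How many low-scoring annotations to flag, and the neighborhood size `k`, are left as free parameters here.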
[Degree-granting institution]: Southwest University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: Q51;Q811.4
