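Application of the Autoassociative Neural Network Algorithm to the Protein Structure Sampling Space
Published: 2018-05-25 15:17
Topics: homology modeling + missing values; Source: North China Electric Power University (Beijing), master's thesis, 2017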
[Abstract]: Protein structure prediction is an important part of research on protein structure and function, and it matters for protein drug design and biopharmaceutical development. If the structures of some proteins in a homologous family are known, they can be used to predict the structures of other homologous proteins whose sequences are known but whose structures are not. Sequence alignment turns sequences of unequal length into equal-length sequences by inserting gaps. The gap positions indicate that the aligned sequences evolved from a common ancestor through insertions and deletions, and so they reflect variation and mutation during biological evolution. Gaps strongly affect both the scale and the accuracy of homology modeling, which makes the missing values in protein sequence alignments worth studying. Earlier methods, such as the nearest-neighbor algorithm and the self-organizing neural network algorithm, already fill in missing protein data well. Both give reasonable imputations, raising the average modeling scale from 62.9% to 82.7% and improving the accuracy from 1.65 Å to 0.88 Å. However, because protein structure space is complex, predicting the protein sampling space is computationally very expensive and therefore slow. The aim of this work is to speed up the computation and reduce its cost while still imputing the missing values reasonably. The work builds on the nonlinear principal component algorithm of autoassociative neural networks (Autoassociative Neural Networks, AANN). Taking into account the complex structure of the protein sampling space and the growth rate of protein sequence databases, it uses an improved inverse nonlinear network model (Inverse NLPCA Model) to fill in the missing values and improve efficiency, and it optimizes the network with the conjugate gradient algorithm to speed up the computation further.
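The abstract names two ingredients, an inverse NLPCA model and conjugate gradient optimization, without giving the details of the thesis's improved variant. As a rough illustration only, the sketch below fits a generic decoder-only (inverse) NLPCA network to a small numeric table with missing entries. It treats the latent scores as free parameters alongside the decoder weights, computes the error on observed entries only, and minimizes it with SciPy's nonlinear conjugate gradient routine. The function name `fit_inverse_nlpca`, the network size, and the toy data are assumptions made for this example; they are not taken from the thesis, and real input would be structural features derived from an alignment.

```python
import numpy as np
from scipy.optimize import minimize


def fit_inverse_nlpca(X, n_components=1, n_hidden=4, seed=0, maxiter=500):
    """Fit a decoder-only (inverse) NLPCA model to data with NaN gaps.

    X has shape (n_samples, n_features); NaN marks a missing value.
    Returns (filled, final_loss): observed values are kept as they are,
    and missing ones are replaced by the network reconstruction.
    """
    n, d = X.shape
    mask = ~np.isnan(X)              # True where a value was observed
    X0 = np.where(mask, X, 0.0)      # placeholder zeros, masked out of the loss
    rng = np.random.default_rng(seed)

    # Parameter blocks: latent scores Z first, then the decoder weights.
    shapes = [(n, n_components), (n_components, n_hidden), (n_hidden,),
              (n_hidden, d), (d,)]
    sizes = [int(np.prod(s)) for s in shapes]

    def unpack(theta):
        parts, start = [], 0
        for shape, size in zip(shapes, sizes):
            parts.append(theta[start:start + size].reshape(shape))
            start += size
        return parts

    def loss_and_grad(theta):
        Z, W1, b1, W2, b2 = unpack(theta)
        H = np.tanh(Z @ W1 + b1)     # hidden layer of the decoder
        Y = H @ W2 + b2              # reconstructed data
        R = (Y - X0) * mask          # residual on observed entries only
        loss = 0.5 * np.sum(R ** 2)
        # Backpropagate to the decoder weights and to the latent inputs.
        gW2 = H.T @ R
        gb2 = R.sum(axis=0)
        dA = (R @ W2.T) * (1.0 - H ** 2)
        gW1 = Z.T @ dA
        gb1 = dA.sum(axis=0)
        gZ = dA @ W1.T
        grad = np.concatenate([g.ravel() for g in (gZ, gW1, gb1, gW2, gb2)])
        return loss, grad

    theta0 = 0.1 * rng.standard_normal(sum(sizes))
    result = minimize(loss_and_grad, theta0, jac=True, method="CG",
                      options={"maxiter": maxiter})
    Z, W1, b1, W2, b2 = unpack(result.x)
    Y = np.tanh(Z @ W1 + b1) @ W2 + b2
    return np.where(mask, X, Y), float(result.fun)


# Toy data (hypothetical): points on a one-dimensional curve in 3-D,
# with a few entries removed to stand in for alignment gaps.
t = np.linspace(-1.0, 1.0, 30)
X_demo = np.column_stack([t, t ** 2, np.sin(2.0 * t)])
X_demo[[2, 11, 25], [1, 0, 2]] = np.nan

filled_demo, loss_demo = fit_inverse_nlpca(X_demo)
print("missing entries left:", int(np.isnan(filled_demo).sum()))
print("masked squared error after fitting:", loss_demo)
```

Because only the decoder half of the autoassociative network is trained, incomplete rows never have to be fed through an encoder, which is the usual reason for preferring the inverse formulation when data are missing. Whether the thesis uses the same loss, architecture, or conjugate gradient variant cannot be determined from the abstract.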
[Degree-granting institution]: North China Electric Power University (Beijing)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: Q51; TP183
Article ID: 1933682
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1933682.html