
A Study of the Diversity of Non-Coded Amino Acids in Human Sperm

Published: 2018-03-16 06:11

Topic: proteome  Focus: non-coded amino acids  Source: Shandong University, master's thesis, 2017  Document type: degree thesis

[Abstract]: The sequence of a protein is usually taken to be fixed by translation of the coding sequence in the genome. In practice, because of post-translational modifications, amino acid substitutions and other causes, the residues actually present often differ from those the genome specifies, which alters protein structure and affects protein function. Even so, the residues present in organisms are rarely determined directly by proteomics. The main reason is that residues differing from the encoded amino acids are usually ignored by ordinary search algorithms; a second reason is that protein sequencing techniques generally depend on a protein database translated theoretically from the genome. Assuming that a peptide sequence contains one or more undefined non-coded amino acid residues offers a way to explain spectra that otherwise cannot be matched to any peptide. In early methods, partial peptide sequences derived from unmatched spectra were used as tags to search the theoretically translated protein database, and the results revealed unexpected post-translational modifications and amino acid substitutions. Later, unrestricted search algorithms were used to identify non-coded amino acid residues without knowing in advance whether they were present. The mass-tolerant method was originally used to detect known modifications by allowing a mass difference between a precursor and its fragments. It has recently been improved: by allowing a wide mass tolerance, it matches peptide sequences that carry a broad range of mass differences or undefined modifications, and it has found many modifications. The main problems of these methods remain a high false-positive rate, low sensitivity and long search times.

Here we systematically studied all possible amino acid residues in human sperm cells whose relative molecular mass differs from that of the amino acid encoded in the genomic sequence; we call these non-coded amino acids (ncAAs). By measuring the mass difference between the encoded amino acids and the actual protein residues, we found more than one million amino acids with a non-zero mass difference, that is, amino acids whose side chains have been altered. Gaussian mixture distribution analysis and iterative regression analysis of these mass differences identified 424 high-confidence Gaussian clusters, and a decision tree built with a machine learning algorithm identified 849 high-confidence ncAAs distributed over 35,274 protein sites. Of these, 180 mass-difference clusters indicate amino acid side-chain structures that have never been reported, and 105 ncAAs match amino acid substitution types, 40 of which were confirmed by transcriptome sequencing. In addition, ANOVA showed that some ncAAs have specific distributions within the normal population, suggesting that they may be related to differences between individuals. Other ncAAs are distributed differently between patients with severe oligoasthenozoospermia and the normal population, suggesting that they are related to the mechanism of the disease; some of the phosphorylation sites involved have been reported in previous studies. Our study shows that ncAAs are widespread in sperm cells, arising mainly from nucleotide polymorphisms, post-translational modifications and some unknown mechanisms, and that they are important for disease diagnosis and targeted drug therapy.
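The clustering step described in the abstract, grouping residue mass differences into Gaussian clusters, can be illustrated with a small sketch. The abstract does not give the thesis's implementation, so everything below is an assumption made for illustration: the function name `fit_gmm_1d`, the quantile-based initialisation, the fixed number of components, and the synthetic data, which simulate two well-known mass shifts (oxidation, about +15.9949 Da, and phosphorylation, about +79.9663 Da). It is a plain expectation-maximisation fit of a one-dimensional Gaussian mixture, not the iterative regression or decision-tree procedure of the thesis.

```python
import math
import random


def fit_gmm_1d(values, k, n_iter=100, min_sigma=1e-4):
    """Fit a k-component 1-D Gaussian mixture by EM (hypothetical helper).

    Returns a list of (weight, mean, sigma) tuples sorted by mean.
    """
    data = sorted(values)
    n = len(data)
    # Deterministic initialisation: place the means at evenly spaced quantiles.
    means = [data[int((i + 0.5) * n / k)] for i in range(k)]
    spread = (data[-1] - data[0]) / (2 * k) or 1.0
    sigmas = [spread] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                for w, m, s in zip(weights, means, sigmas)
            ]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and sigma of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # empty component: keep its previous parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
    return sorted(zip(weights, means, sigmas), key=lambda c: c[1])


# Synthetic mass differences in Da (observed residue mass minus encoded mass).
# These are simulated values for illustration, not data from the thesis.
rng = random.Random(0)
mass_diffs = [rng.gauss(15.9949, 0.005) for _ in range(300)]
mass_diffs += [rng.gauss(79.9663, 0.005) for _ in range(200)]

components = fit_gmm_1d(mass_diffs, k=2)
for weight, mean, sigma in components:
    print(f"weight={weight:.2f}  mean={mean:.4f} Da  sigma={sigma:.4f} Da")
```

With real data the number of clusters is not known in advance, so a fixed `k` as used here would have to be replaced by some model-selection step; the abstract does not say how the 424 clusters were selected.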
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: R321.1

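[Similar literature]

Related master's theses (2)

1 Zhang Chen; Construction of a UAA-Encoded Amino Acid Expression System [D]; Jilin University; 2017

2 Chen Xinjun; A Study of the Diversity of Non-Coded Amino Acids in Human Sperm [D]; Shandong University; 2017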


Article link: http://sikaile.net/yixuelunwen/jichuyixue/1618651.html
