A Cluster Model of Bacterial Essential Genes and the Construction of a Minimal Gene Set
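Topic: experimentally determined essential genes + essential gene clusters. Source: doctoral dissertation by Ye Yuannong, University of Electronic Science and Technology of China, 2015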
[Abstract]: Essential genes are the genes an organism cannot do without if it is to maintain basic life activities. Bacterial essential gene sets have recently become a research focus in microbiology, medicine, genomics, bioinformatics, and related disciplines. Because of their importance, essential genes have become a foundation of synthetic biology; they are also potential targets for antibacterial drug design and help in understanding the earliest common ancestor of life. Taking essential genes as its research object, this dissertation proposes an essential gene cluster model and builds the first database of essential gene clusters (Database of cluster of essential gene, CEG). Based on CEG, an essential gene prediction algorithm and its software implementation (CEG_Match) were developed, a blueprint of a bacterial minimal gene set was drawn up, and a minimal metabolic network was reconstructed. Using the species in CEG as the reference set, gene fitness was calculated for 2186 bacteria, and the first bacterial gene fitness database (IFIM) was built. The details are as follows.

(1) We propose, for the first time, an essential gene cluster model for storing essential genes, in contrast to existing essential gene databases, which store them as individual genes, and we build the first essential gene cluster database on this model. The model (database) contains clusters of homologous essential genes. It covers 16 strains (15 species) whose essential genes have been determined experimentally, and groups genes that have the same function across these species into one cluster. This yields 932 real essential gene clusters containing two or more essential genes, and 1929 pseudo-clusters containing only one essential gene. Storing essential genes by cluster makes the data much easier to use. For example, from the size of each cluster a user can readily tell whether an essential gene is conserved across many bacterial species or is species-specific. The model (database) also records, for the genes (proteins) of each essential gene cluster, their conservation relative to human. With cluster size, conservation relative to human, and other such information, researchers can carry out studies in evolution and drug design.

(2) Based on the proposed cluster model, we developed a K-value algorithm for essential gene prediction and implemented it as software (CEG_Match). The software relies on the functional homology of genes rather than on sequence homology. A gene therefore does not need to be sequenced; determining its function by a simple experiment is enough to predict whether it is essential. The software is easy to use. Compared with BLAST homology search, it has a lower false positive rate while maintaining reasonably high accuracy, and its running time is far shorter.

(3) Understanding the survival fitness of an organism is important for a complete understanding of microbial genetics and for effective drug design. Existing essential gene databases provide only binary, experimentally determined essentiality data. We integrated the experimental data for the bacteria in CEG, combined them with theoretical predictions, proposed using continuous values to express gene essentiality, and built the first microbial gene fitness database. The database covers the gene fitness of 11 bacteria in CEG determined by single-gene knockout and transposon mutagenesis experiments, the experimental gene fitness of 1 yeast, and theoretically predicted gene fitness data for 2186 bacteria. The predicted gene fitness correlates significantly with the experimental gene fitness, which indicates that the predictions are as reliable as the experimental values. Users can access and browse the data through a user-friendly interface. As the first database to store microbial gene fitness resources, it helps researchers better understand microbial genetics and develop antibacterial drugs that reduce the drug resistance of pathogens, especially for species that lack experimentally determined gene fitness.

(4) Finally, based on CEG, we drew up a blueprint of a bacterial minimal gene set and reconstructed a minimal metabolic network. A minimal gene set is very important for assembling a minimal artificial cell. Although several bacterial minimal gene sets have been reported, the published sets satisfy only the self-replication (reproduction) system, or include metabolism-related genes only to a limited extent. To obtain a more reliable and complete bacterial minimal gene set, we depart from traditional strategies in the following ways: we take CEG as the basis and start from experimentally determined essential genes, we propose a half-retention method for determining conserved genes, and we introduce minimal metabolic network reconstruction to complete the minimal gene set. The result is a minimal gene set of 315 essential genes, of which 157 take part in the minimal metabolic network, which involves 431 metabolic reactions. This is the first minimal gene set to satisfy both the self-replication (reproduction) system and the self-maintenance (metabolism) system. Through the reconstruction of the minimal metabolic network, we confirmed the 20 key metabolites already known and newly identified 5 more. We also found that, in the minimal metabolic network, highly essential genes tend to distribute the metabolites they involve across multiple reactions, which suggests that when one reaction is disrupted the bacterium can keep more of its metabolism running normally and so reduce the risk of death. Finally, the dissertation discusses applications of the minimal gene set: it can be used to expand existing drug target databases and develop new drugs that reduce bacterial drug resistance, and a semi-de novo synthesis strategy is proposed to help design and synthesize a chassis cell with broad biological applications.

In summary, this dissertation explores bacterial essential genes and minimal gene sets fairly comprehensively and applies the results to essential gene prediction, drug target gene discovery, and synthetic biology. The work has made some progress, but several problems still need further study.
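To make the cluster model in item (1) and the half-retention idea in item (4) more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record layout, the strain names, the function labels F1 to F3, and the retention rule (keep a cluster if it has members in at least half of the reference strains) are hypothetical choices made here, since the abstract does not spell out these details, and this is not the CEG implementation.

```python
from collections import defaultdict

# Toy records: (strain, gene_id, function_label). All values are hypothetical.
RECORDS = [
    ("strain_A", "a1", "F1"), ("strain_B", "b1", "F1"),
    ("strain_C", "c1", "F1"), ("strain_E", "e1", "F1"),
    ("strain_A", "a2", "F2"), ("strain_D", "d1", "F2"),
    ("strain_B", "b2", "F3"),
]


def build_clusters(records):
    """Group essential genes that share a function label into one cluster."""
    clusters = defaultdict(list)
    for strain, gene_id, function in records:
        clusters[function].append((strain, gene_id))
    return dict(clusters)


def split_real_and_pseudo(clusters):
    """Real clusters hold two or more essential genes; pseudo-clusters hold one."""
    real = {f: m for f, m in clusters.items() if len(m) >= 2}
    pseudo = {f: m for f, m in clusters.items() if len(m) == 1}
    return real, pseudo


def half_retained(clusters, n_strains):
    """Assumed rule: keep clusters present in at least half of the strains."""
    kept = {}
    for function, members in clusters.items():
        strains_with_gene = {strain for strain, _ in members}
        if 2 * len(strains_with_gene) >= n_strains:
            kept[function] = members
    return kept


clusters = build_clusters(RECORDS)
real, pseudo = split_real_and_pseudo(clusters)
n_strains = len({strain for strain, _, _ in RECORDS})
conserved = half_retained(clusters, n_strains)

print("real clusters:", sorted(real))
print("pseudo-clusters:", sorted(pseudo))
print("retained under the assumed half rule:", sorted(conserved))
```

Keying clusters by a function label mirrors the abstract's statement that genes with the same function are grouped together; counting distinct strains rather than genes in `half_retained` is a design choice of this sketch, so that several genes from one strain do not inflate a cluster's apparent conservation.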
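[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree conferred]: 2015
[Classification number]: Q78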
Article link: http://sikaile.net/shoufeilunwen/jckxbs/2115898.html