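Construction and Functional Annotation of the Bacterial sRNA Target Database 3.0
Published: 2018-01-16 04:24
Source: Wang Jiang, doctoral dissertation, Academy of Military Medical Sciences of the Chinese People's Liberation Army, 2016. Document type: degree thesis.
Keywords: bacterial sRNA; target mRNA; database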
[Abstract]: Bacterial sRNAs are important regulatory RNAs involved in many biological processes, including metabolism, quorum sensing, biofilm formation, iron regulation, and virulence regulation. They act mainly by binding to target mRNAs or proteins. Systematically collecting experimentally verified bacterial sRNA targets and building a database system to manage and analyze them therefore helps clarify sRNA function and mechanism, and also supports the development of bacterial sRNA target prediction models.

The main databases currently related to bacterial sRNAs are sRNAMap, sRNAdb, Rfam, RegulonDB, NPInter, BSRD, and sRNATarBase, each with its own focus in data collection and annotation. sRNAMap is a database of sRNAs from Gram-negative bacteria. It contains 397 sRNAs, 62 sRNA transcription factors, and 60 sRNA targets from 70 microbial genomes, and also provides predicted sRNA secondary structures, sRNA expression conditions, and sRNA expression levels. sRNAdb collects sRNAs from Gram-positive bacteria. It covers 558 Gram-positive bacterial genomes and plasmids, 671 experimentally verified bacterial sRNAs, and 9,993 predicted bacterial sRNAs, and it can analyze user-submitted sRNA data to find homologous sRNAs. Rfam collects ncRNA families from eukaryotes and prokaryotes and provides secondary structure information. For bacterial sRNAs it mainly holds sequence information and does not cover sRNA targets. RegulonDB is a database of the transcriptional regulatory network of Escherichia coli K-12, including transcription units (TUs), promoters, and transcriptional regulators (TRs). It contains 110 sRNAs and 227 sRNA-target interactions, including 53 target mRNA binding sites. NPInter collects experimentally verified interactions between noncoding RNAs (excluding tRNA and rRNA) and other biomolecules (proteins, RNA, and genomic DNA). NPInter v2.0 contains 201,107 interaction entries covering 18 species, including 32 bacterial sRNAs and 107 bacterial sRNA-target interactions, but it records no binding site information. BSRD, developed by Huang et al. in 2013, is a comprehensive bacterial sRNA database that systematically collects bacterial sRNA information and integrates extensive annotation. By integrating other databases and manually curating the literature, BSRD collected 897 experimentally verified bacterial sRNAs, 8,248 sRNA homologs, and 507 candidate sRNAs predicted from high-throughput sequencing data. For sRNA targets, it mainly integrates predicted targets and the target information provided by sRNATarBase. sRNATarBase is a database of experimentally verified bacterial sRNA targets that our group developed in 2010. It holds 392 entries from 17 bacterial genomes, covering 68 sRNAs and 227 targets (or non-targets), and notably includes binding site information for sRNA-mRNA interactions.

This survey shows that, apart from sRNATarBase, no existing database provides complete bacterial sRNA target information, and in particular none provides sRNA-mRNA interaction site information, which hinders the development of models for predicting sRNA target mRNAs. In addition, sRNATarBase has not been updated for a long time. This project therefore builds a new bacterial sRNA target database on the basis of sRNATarBase and carries out functional annotation research on top of it.

To build a comprehensive, feature-rich bacterial sRNA target database, this study collected data using three strategies. (1) Using the latest NCBI genome annotations and the literature cited by each sRNATarBase 2.0 entry, all 392 entries of version 2.0 were checked and systematically updated, including the NCBI identifier links, genomic positions, sequences, and site coordinates of sRNAs and targets. (2) PubMed was searched with different keyword combinations, such as "bacterial sRNA target" and "bacterial small regulatory RNA target", yielding 3,124 papers published between January 1, 2010 and June 1, 2015. Based on their abstracts, 120 papers containing bacterial sRNA target data were selected and then read in detail to extract the required sRNA-target information and experimental evidence. (3) To avoid missing target data, sRNA-target datasets were extracted from the publications of all bacterial sRNA target prediction tools and compared with the data already in the database. As of June 1, 2015, the database contained 771 sRNA-target entries from 53 genomes, of which 492 are experimentally verified sRNA-target interactions and 279 are verified non-interactions. The entries comprise 752 sRNA-mRNA records and 19 sRNA-protein records.

We also set up a new web server for the database. The website (http://ccb1.bmi.ac.cn/srnatarbase/) offers six main functions: (1) searching the database by general information (sRNA information, target information, sRNA-target interaction information, and experimental evidence), by sequence (BLAST), and by literature, with support for multi-condition combined queries; (2) dynamic display of RNA secondary structures; (3) NCBI sequence display of bacterial sRNA-target interactions; (4) display of bacterial sRNA-target regulatory networks; (5) target prediction based on sRNATarget and sTarPicker, with functional enrichment analysis of the predicted targets, for which the site offers a choice of three annotation platforms, DAVID, GOEAST, and PANTHER; (6) phylogenetic analysis, used to test whether an sRNA-target interaction is conserved in closely related genomes.

In the database, some sRNAs have multiple targets and some targets are regulated by multiple sRNAs. To study the relationship between one sRNA and a group of targets, or between one target and a group of sRNAs, we developed the online server CosTar, a tool for analyzing the co-regulation of bacterial sRNA targets. Given a set of sRNAs (or genes) obtained experimentally, for example a set of genes differentially expressed under different conditions, CosTar predicts the list of genes (or sRNAs) likely to interact with them and so provides guidance for further experiments. We obtained 897 sRNA sequences from BSRD and downloaded the latest bacterial genome sequences from NCBI, then ran the two prediction tools sRNATarget and sTarPicker in batch on the selected sRNAs and stored the results in a unified format in a predicted-target database. When the input is a set of sRNAs, a P value is computed for each mRNA from the hypergeometric distribution, and all targets are ranked by P value. mRNAs whose P value is below a given threshold are taken as the predicted targets of that set of sRNAs. CosTar is provided as an online analysis server for the convenience of researchers.

In summary, this thesis centers on bacterial sRNAs and consists of two parts. (1) We built the bacterial sRNA target database sRNATarBase 3.0. It contains 771 records from 213 papers, including 492 experimentally verified bacterial sRNA-target interactions and 316 binding sites. Compared with other bacterial sRNA databases (RegulonDB, BSRD, sRNAMap, NPInter, and others), sRNATarBase 3.0 provides the most recent and most complete bacterial sRNA target data and also includes the 316 binding sites and the mutation information from the experiments. The new website adds NCBI sequence display, sRNA regulatory networks, predicted targets with GO annotation, and phylogenetic analysis, making sRNATarBase 3.0 a feature-rich bacterial sRNA target database. (2) We built CosTar, an online server for predicting the co-regulation of bacterial sRNA targets. CosTar provides two functions, sRNA-Gene and Gene-sRNA: it can predict the target mRNAs co-regulated by a set of sRNAs, and it can predict the sRNAs that regulate a set of target mRNAs.

The work has three main features and innovations. (1) sRNATarBase 3.0 holds the most comprehensive bacterial sRNA target data and can supply comprehensive, accurate data for related research, such as the development of bacterial sRNA target prediction models. (2) The database website offers NCBI genome display, sRNA regulatory networks, GO analysis, and other tools, so that sRNA target data can be interpreted from several angles. (3) CosTar is the first tool to analyze bacterial sRNA-target data from the perspective of co-regulation.
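The abstract states only that CosTar scores each mRNA with a hypergeometric P value and keeps those below a threshold. It does not say how the counts are defined. The following is a minimal sketch under one plausible reading, labeled as an assumption: the population is all sRNAs in the predicted-target table, a "success" is an sRNA predicted to target the mRNA, and the draw is the user's query set. The function names, the data layout, and the toy table are hypothetical and are not taken from CosTar.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N items, K successes, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def rank_co_targets(predicted, query_srnas, threshold=0.05):
    """Rank mRNAs hit by the query sRNAs by hypergeometric P value.

    predicted: dict mapping sRNA name -> set of predicted target mRNAs
               (assumed layout of the predicted-target table).
    Returns (mRNA, P value) pairs with P value below threshold, best first.
    """
    N = len(predicted)                       # all sRNAs with predictions
    query = sorted(set(query_srnas) & set(predicted))
    n = len(query)                           # size of the query set

    totals = {}                              # K: sRNAs targeting each mRNA overall
    for targets in predicted.values():
        for mrna in targets:
            totals[mrna] = totals.get(mrna, 0) + 1

    hits = {}                                # k: query sRNAs targeting each mRNA
    for srna in query:
        for mrna in predicted[srna]:
            hits[mrna] = hits.get(mrna, 0) + 1

    scored = [(m, hypergeom_sf(k, N, totals[m], n)) for m, k in hits.items()]
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return [(m, p) for m, p in scored if p < threshold]


# Toy predicted-target table (made-up names, for illustration only).
toy = {
    "s1": {"a", "b"}, "s2": {"a", "c"}, "s3": {"c"},
    "s4": {"c"}, "s5": {"d"}, "s6": {"d"},
}
result = rank_co_targets(toy, ["s1", "s2"], threshold=0.1)
print(result)
```

In the toy table, mRNA "a" is predicted for both query sRNAs and for no others, so it gets the smallest P value, while "c" is hit by one query sRNA but is also a common target overall and is filtered out. The Gene-sRNA direction could be handled the same way with the roles of sRNAs and mRNAs swapped.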
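[Degree-granting institution]: Academy of Military Medical Sciences of the Chinese People's Liberation Army
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: Q78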