
Research on Methods for Identifying Key Alternative Splicing Events from RNA-Seq Data

Published: 2018-11-04 10:31
[Abstract]: Alternative splicing is an intrinsic cellular mechanism by which a single gene coding region can, during gene expression, be transcribed and translated into multiple proteins with different functions. Identifying alternative splicing events is important for studying gene function, protein structural diversity, cell differentiation, and species evolution. With the advent and rapid development of high-throughput sequencing, identifying alternative splicing events from transcriptome sequencing (RNA-Sequencing, RNA-Seq) data has become a frontier topic in bioinformatics. However, accurately identifying exon skipping events and intron retention events from RNA-Seq data remains an open problem. Existing methods for these two event types still have a number of shortcomings: they build their computational models from only part of the information related to exon skipping or intron retention; they use low-quality sequencing reads; they follow no uniform standard for normalizing features; and they do not indicate which features are most effective for accurate identification. This thesis examines these problems in depth. Its main work and contributions are as follows.

(1) A feature analysis method for exon skipping events. The method describes each exon with multiple features related to exon skipping, evaluates the effect of each feature on accurate identification, and constructs four feature sets to evaluate the effect of different feature normalization schemes. Using real RNA-Seq data from human skeletal muscle, brain, heart, and liver tissue, a reference set was built by integrating the results of existing methods. The thesis then assesses the importance of each exon-skipping feature and analyzes how different normalized representations of read-based features affect identification accuracy. The experiments show that two features have an important effect on accurate identification: the reads that support skipping, that is, reads joining the exon upstream of the alternative exon to the exon downstream of it, and the inclusion level (psi score) of the alternative exon. They also show that describing the features with raw read counts or with normalized read counts makes no significant difference to identification accuracy.

(2) EScall, an exon skipping identification method based on multi-feature analysis. EScall filters out read alignments that have low mapping quality or are ambiguous, and integrates several features related to exon skipping, including reads mapped within exon bodies, reads supporting the junction between two exons, and gene expression information. It introduces a new score for exon skipping events, used to identify such events from RNA-Seq data under two different conditions. EScall was applied to real RNA-Seq data from human skeletal muscle and brain tissue. Compared with the results of other methods, EScall effectively reduced both false positives and false negatives and produced better predictions.

(3) IRcall, an intron retention identification method based on a joint score. IRcall integrates seven features related to intron retention: reads mapped within the intron; junction reads that support splicing of the intron by joining its upstream exon to its downstream exon; reads mapped within the upstream and downstream exons; reads overlapping the 5' splice site; reads overlapping the 3' splice site; the proportion of the intron region covered by reads; and gene expression information. It introduces a new joint score for intron retention events, IRScore, used to identify such events from RNA-Seq data under two different conditions. IRcall was applied to real RNA-Seq data from the Arabidopsis thaliana skip mutant and wild type. Compared with the results of other methods, IRcall effectively reduced false positives and therefore identified intron retention events more accurately.

(4) IRclassifier, an intron retention identification method based on random forests. IRclassifier builds a reference set by integrating the results of three methods, describes each intron with 21 features related to intron retention, and trains a random forest classifier to identify intron retention events from RNA-Seq data under two different conditions. It also analyzes the effect of each feature on identification accuracy. On real RNA-Seq data from the Arabidopsis thaliana skip mutant and wild type, the training set was built by integrating the results of three existing methods on chromosomes 1, 2, and 4. On this data, IRclassifier reached an accuracy of 99.2%. IRclassifier was then used to identify intron retention events on chromosomes 3 and 5. Comparison with the results of existing methods shows that IRclassifier identifies intron retention events accurately, which confirms the effectiveness of the method.
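The abstract names the inclusion level (psi score) of the alternative exon as one of the two most informative exon-skipping features, but does not give its formula. As a rough illustration only, the sketch below uses a common junction-read definition of percent spliced in: the two inclusion junctions are averaged and compared with the single skipping junction. The function name `psi_from_junction_reads` and its parameters are hypothetical, and this is not necessarily the definition used in the thesis.

```python
from typing import Optional


def psi_from_junction_reads(
    upstream_inclusion_reads: int,
    downstream_inclusion_reads: int,
    skipping_reads: int,
) -> Optional[float]:
    """Estimate percent spliced in (psi) for one alternative exon.

    Assumed definition (not taken from the thesis):
      inclusion = mean of reads on the two junctions that include the exon
      psi       = inclusion / (inclusion + skipping)
    Returns None when no junction read supports either isoform.
    """
    # Inclusion is supported by two junctions (upstream-exon to alternative
    # exon, alternative exon to downstream-exon), skipping by only one, so
    # the inclusion counts are averaged to keep the two sides comparable.
    inclusion = (upstream_inclusion_reads + downstream_inclusion_reads) / 2
    total = inclusion + skipping_reads
    if total == 0:
        return None  # psi is undefined without supporting reads
    return inclusion / total


if __name__ == "__main__":
    # 30 and 50 inclusion-junction reads, 10 skipping-junction reads
    print(psi_from_junction_reads(30, 50, 10))
```

A psi near 1 means the exon is almost always included, and a psi near 0 means it is almost always skipped. Comparing psi between two conditions is one simple way to flag candidate exon skipping events.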
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: Q78


Article ID: 2309554


Article link: http://sikaile.net/shoufeilunwen/jckxbs/2309554.html

