Research on Methods for Identifying Key Alternative Splicing Events Based on RNA-Seq Data
[Abstract]: Alternative splicing is an intrinsic cellular mechanism by which a single gene can be transcribed and translated into multiple proteins with different functions. Identifying alternative splicing events is of great significance for studying protein structure, protein diversity, cell differentiation, and species evolution. With the advent and rapid development of high-throughput sequencing technology, identifying alternative splicing events from high-throughput transcriptome sequencing (RNA sequencing, RNA-Seq) data has become a frontier problem in bioinformatics. However, accurately identifying exon skipping and intron retention events from RNA-Seq data remains an open problem. Existing methods have several shortcomings: they build their computational models from only part of the information related to exon skipping or intron retention events; they use low-quality sequencing reads; they follow no uniform standard for feature normalization; and they do not indicate which features are most effective for accurately identifying alternative splicing events. This thesis studies these problems in depth. Its main work and contributions are as follows. (1) A feature analysis method for exon skipping events is proposed. Its contributions include representing each exon by multiple features related to exon skipping events, assessing the effect of each feature on the accurate identification of exon skipping events, and constructing four feature sets to assess the effect of different feature normalization methods on identification accuracy.
On real human RNA-Seq data from skeletal muscle, brain, heart, and liver tissue, the identification results of existing methods were merged into a reference set, and the effect of different normalized representations of read-based features on the accurate identification of exon skipping events was analyzed. The experimental results indicate that two features have an important influence on accurate identification: the number of reads that map to the junction between the exons upstream and downstream of the alternative exon, and thus support skipping of that exon, and the inclusion level (percent spliced in, PSI) of the alternative exon. The results also show that expressing features as raw read counts or as normalized read counts makes no significant difference to identification accuracy. (2) EScall, a method for identifying exon skipping events based on multi-feature analysis, is proposed. Its contributions include filtering out alignments with low mapping quality and ambiguously mapped reads; combining multiple features related to exon skipping events, including reads mapped within exons, reads that support the junction between two exons, and gene expression information; and designing a new score for exon skipping events, which is used to identify exon skipping events from RNA-Seq data under two different conditions. Exon skipping events were identified with EScall on real human skeletal muscle and brain tissue RNA-Seq data.
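The inclusion level (PSI) named above can be illustrated with a minimal sketch. It assumes a common junction-read convention (the mean of the two inclusion-junction counts divided by that mean plus the skipping-junction count); the abstract does not state the exact formula used in the thesis, and the function and parameter names here are hypothetical.

```python
def psi(upstream_junction_reads, downstream_junction_reads, skipping_junction_reads):
    """Percent spliced in (PSI) of an alternative exon from junction read counts.

    Assumption (not taken from the thesis): inclusion support is the mean of the
    reads spanning the upstream-exon/alternative-exon junction and the
    alternative-exon/downstream-exon junction; skipping support is the number of
    reads spanning the upstream-exon/downstream-exon junction.
    """
    inclusion = (upstream_junction_reads + downstream_junction_reads) / 2.0
    total = inclusion + skipping_junction_reads
    if total == 0:
        # No junction reads at all: the inclusion level is undefined.
        return None
    return inclusion / total
```

A PSI near 1 means the exon is almost always included, and a PSI near 0 means it is almost always skipped. Returning `None` when no junction reads are present keeps unsupported exons distinguishable from exons that are truly always skipped.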
Comparison with the results of other methods shows that EScall effectively reduces false positives and false negatives and yields better predictions. (3) A new method for identifying intron retention events based on a joint score is proposed. It combines seven features related to intron retention events: reads mapped within the intron, reads that support splicing of the intron's upstream and downstream exons, reads mapped within the exons upstream and downstream of the intron, reads overlapping the 5' splice site, reads overlapping the 3' splice site, the proportion of the intron covered by reads, and gene expression information. From these features a new joint score for intron retention events, IRscore, is designed and used to identify intron retention events from RNA-Seq data under two different conditions. Intron retention events were identified with this method on real RNA-Seq data from the Arabidopsis thaliana skip mutant and wild type. Comparison with the results of other methods shows that the method effectively reduces false positives and therefore identifies intron retention events more accurately.
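One of the seven features listed above, the proportion of the intron covered by reads, can be sketched as follows. This is only an illustration under stated assumptions: coordinates are half-open intervals `[start, end)`, reads are given as plain `(start, end)` tuples, and the function name is hypothetical. The abstract does not say how the thesis computes this feature or how it is weighted in the joint score.

```python
def intron_coverage_ratio(intron_start, intron_end, read_intervals):
    """Fraction of intron bases covered by at least one read.

    Assumption: all coordinates are half-open [start, end) on one chromosome.
    """
    length = intron_end - intron_start
    if length <= 0:
        raise ValueError("intron must have positive length")

    # Keep only reads that overlap the intron, clipped to the intron bounds.
    clipped = sorted(
        (max(start, intron_start), min(end, intron_end))
        for start, end in read_intervals
        if end > intron_start and start < intron_end
    )

    covered = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A gap: close the previous merged block and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent read: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / length
```

Merging overlapping reads before summing avoids counting the same base twice, so the ratio stays between 0 and 1 regardless of sequencing depth.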
(4) IRclassifier, a random-forest-based method for identifying intron retention events, is proposed. A reference set is constructed from the results of three existing methods, each intron is represented by 21 features related to intron retention events, and a random forest classifier is built to identify intron retention events from RNA-Seq data under two different conditions while also analyzing the effect of each feature on identification accuracy. On real RNA-Seq data from the Arabidopsis thaliana skip mutant and wild type, the training set was constructed by combining the results of the three existing methods on chromosomes 1, 2, and 4. Intron retention events were then identified with IRclassifier, with an accuracy of 99.2%. IRclassifier was further used to identify intron retention events on chromosomes 3 and 5; comparison with the results of existing methods shows that it identifies intron retention events accurately, which verifies the effectiveness of the method.
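The reference-set construction and the chromosome-based split described in (4) can be sketched as below. The abstract does not say how the three methods' results are combined, so the voting rule (`min_votes`, defaulting to agreement of at least two methods) is purely an assumption, as are the function names and the `(chromosome, start, end)` representation of an intron. Only the split itself (chromosomes 1, 2, and 4 for training; 3 and 5 for evaluation) comes from the text.

```python
from collections import Counter

TRAIN_CHROMS = {"1", "2", "4"}  # chromosomes used to build the training set
TEST_CHROMS = {"3", "5"}        # chromosomes held out for comparison


def build_reference_labels(calls_by_method, all_introns, min_votes=2):
    """Label each intron as retained (True) or not (False).

    Assumption: an intron is a positive example when at least `min_votes`
    of the existing methods report it; the thesis may use a different rule.
    """
    votes = Counter()
    for calls in calls_by_method:
        votes.update(set(calls))  # each method votes at most once per intron
    return {intron: votes[intron] >= min_votes for intron in all_introns}


def split_by_chromosome(labels):
    """Split labelled introns, keyed by (chromosome, start, end), into train and test."""
    train = {k: v for k, v in labels.items() if k[0] in TRAIN_CHROMS}
    test = {k: v for k, v in labels.items() if k[0] in TEST_CHROMS}
    return train, test
```

Splitting by chromosome rather than at random keeps the held-out introns physically separate from the training introns. The labelled training introns, each with its feature vector, could then be passed to any random forest implementation.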
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Doctoral
【Year degree granted】: 2016
【Classification number】: Q78
Article link: http://sikaile.net/shoufeilunwen/jckxbs/2309554.html