SEQ transcriptome expression multi-mapping non-uniformity
Keywords of this article: Improved Transcriptome Expression Analysis for RNA-Seq Data; compiled and published by 筆耕文化傳播.
Improved Transcriptome Expression Analysis for RNA-Seq Data
Shi Xinxin, Liu Xuejun, Zhang Li (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
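Abstract (from the Chinese): RNA-Seq (RNA-sequencing), based on high-throughput sequencing, is a new technique for transcriptome research. To address two difficulties that arise when this technique is used for transcriptome expression analysis, namely multi-mapping of reads and non-uniform read distribution, an improved transcriptome expression method, LDASeqII (Improvement of latent Dirichlet allocation for sequencing data), is proposed. The model uses splice-isoform structure information to constrain its parameters and normalizes the read counts of each exon, which resolves the multi-mapping problem under non-uniform read distribution. "Pseudo-exons" and "pseudo-transcripts" are introduced to handle junction reads and noise reads, respectively. The model is applied to real datasets and compared with the original LDASeq (Latent Dirichlet allocation for sequencing data) model and with the currently popular Cufflinks and RSEM (RNA-Seq by expectation maximization) methods. The results show that the improved method yields more accurate transcript and gene expression estimates.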
Abstract: RNA-Seq (RNA-sequencing), based on high-throughput sequencing, is a new technique for transcriptome research. Considering the difficulties in analyzing transcript expression from RNA-Seq data, an improved method, LDASeqII (improvement of latent Dirichlet allocation for sequencing data), is proposed to calculate transcript expression. To deal with multi-mapping between reads and isoforms and with the non-uniform distribution of reads along the reference, LDASeqII uses the known gene-isoform annotation to constrain the hyperparameters and normalizes the read count of each individual exon by its length. By introducing a "pseudo-exon" and a "pseudo-transcript", junction reads and noise reads are handled appropriately. LDASeqII is validated on two real datasets for gene and transcript expression calculation and compared with LDASeq (latent Dirichlet allocation for sequencing data) and two other popular methods, Cufflinks and RSEM (RNA-Seq by expectation maximization). The results show that LDASeqII obtains more accurate transcript and gene expression measurements than the other approaches.
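The two ideas named in the abstract, normalizing each exon's read count by exon length and letting the gene-isoform annotation restrict which isoforms may explain an exon, can be illustrated with a small sketch. This is not the LDASeqII model: it swaps in a plain EM mixture for the latent Dirichlet allocation model, it ignores pseudo-exons and pseudo-transcripts, and every name in it (`normalize_exon_counts`, `estimate_isoform_expression`, the exon and isoform labels, the counts) is hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def normalize_exon_counts(counts: Dict[str, int],
                          lengths: Dict[str, int]) -> Dict[str, float]:
    """Return reads per base for each exon (read count divided by exon length)."""
    return {exon: counts[exon] / lengths[exon] for exon in counts}


def estimate_isoform_expression(density: Dict[str, float],
                                isoform_exons: Dict[str, Set[str]],
                                n_iter: int = 500) -> Dict[str, float]:
    """Allocate exon-level signal to isoforms with a simple EM mixture.

    Assumption (not from the paper): after length normalization, an isoform
    spreads its signal evenly over the exons it contains, so
    p(exon | isoform) = 1 / (number of exons in the isoform).
    The annotation acts as a hard constraint: an isoform never receives
    signal from an exon it does not contain.
    """
    isoforms = sorted(isoform_exons)
    theta = {t: 1.0 / len(isoforms) for t in isoforms}  # mixture weights

    for _ in range(n_iter):
        expected = {t: 0.0 for t in isoforms}
        for exon, d in density.items():
            # E-step: responsibility of each annotated isoform for this exon.
            weights = {t: theta[t] / len(isoform_exons[t])
                       for t in isoforms if exon in isoform_exons[t]}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # exon not explained by any annotated isoform
            for t, w in weights.items():
                expected[t] += d * w / norm
        assigned = sum(expected.values())
        if assigned == 0.0:
            raise ValueError("no exon signal could be assigned to any isoform")
        # M-step: renormalize the expected signal into mixture weights.
        theta = {t: expected[t] / assigned for t in isoforms}

    # Convert "share of total signal" into relative per-base expression by
    # dividing out the number of exons each isoform covers.
    per_exon = {t: theta[t] / len(isoform_exons[t]) for t in isoforms}
    scale = sum(per_exon.values())
    return {t: v / scale for t, v in per_exon.items()}


if __name__ == "__main__":
    # Toy gene: isoform T1 contains E1, E2, E3; isoform T2 skips E2.
    counts = {"E1": 300, "E2": 400, "E3": 150}
    lengths = {"E1": 100, "E2": 200, "E3": 50}
    annotation = {"T1": {"E1", "E2", "E3"}, "T2": {"E1", "E3"}}
    density = normalize_exon_counts(counts, lengths)
    print(estimate_isoform_expression(density, annotation))
```

In the toy gene, reads on the shared exons E1 and E3 are ambiguous between the two isoforms, while E2 is unique to T1; the length-normalized signal on E2 is what lets the EM loop split the shared signal. Without the normalization step, the long exon E2 would dominate simply because of its length.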
Keywords: gene expression; RNA-Seq; transcript expression; multi-mapping; non-uniformity
Funding: supported by the National Natural Science Foundation of China (61170152) and the Fundamental Research Funds for the Central Universities (CXZZ11_0217).
Article link: http://sikaile.net/shoufeilunwen/benkebiyelunwen/231369.html