Machine Learning-Based microRNA Gene Prediction
Published: 2018-04-03 01:31
Topic: microRNA. Focus: pre-microRNA. Source: Hebei University of Technology, master's thesis, 2011
[Abstract]: A microRNA is a single-stranded, non-coding small RNA molecule about 20-24 nucleotides long. It is produced when a microRNA precursor (pre-microRNA) of roughly 70 nucleotides is processed by the Dicer and Dicer-like-1 endonucleases, which have RNase III activity. microRNAs regulate gene expression by inducing cleavage of the target mRNA or by inhibiting its translation. Nearly one third of human genes are regulated by microRNAs, which play an important regulatory role in biological processes such as cell proliferation and differentiation, cell death, early development, and metabolism. Studies also show a close link between microRNAs and cancer. Research on microRNAs helps clarify the regulatory networks between genes and supports the study of gene function and of evolution. Although microRNAs have been found in 55 species, the number identified so far is much smaller than the number that actually exists, and many microRNAs remain to be discovered. Predicting microRNAs is therefore of great significance.
There are currently two main approaches to microRNA prediction: cDNA cloning and computational prediction. cDNA cloning was the main approach in early microRNA research. It is direct and reliable, but it has difficulty capturing microRNAs that are expressed only at particular stages or only in specific tissues or cell lines. Computational prediction is not affected by the expression time, expression level, or tissue specificity of a microRNA, so it can compensate for the shortcomings of cDNA cloning and sequencing.
This thesis proposes a machine learning-based microRNA prediction method called ACO+SVM. Because a pre-microRNA sequence is relatively long and can fold into a stem-loop structure, the method combines sequence and structural features of the pre-microRNA to extract the corresponding attributes. A classifier that separates the two classes is built from known positive and negative pre-microRNAs. Support vector machines (SVM) have good approximation and generalization properties, so the classifier is trained with an SVM. Since the performance of an SVM classifier depends heavily on the kernel function and its parameters, ant colony optimization (ACO) is used to search for the SVM parameters, with the aim of building an unbiased classifier that has both high sensitivity and high specificity. Experimental results show that the method effectively distinguishes real from pseudo human pre-microRNAs, achieves high accuracy on several other species, and has better sensitivity and specificity than comparable methods.
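The parameter search described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the ACO formulation, the parameter ranges, or the objective used in the thesis, so everything below is an assumption made for illustration: the function names (`aco_search`, `balanced_objective`, `toy_score`), the discrete grid of C and gamma values, the evaporation rate, and the use of the geometric mean of sensitivity and specificity as an "unbiased" objective. `toy_score` is a stand-in for the cross-validated quality of an SVM trained with the given parameters. It is not a real classifier.

```python
import math
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); label 1 = real pre-microRNA, 0 = pseudo."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def balanced_objective(sens, spec):
    """Assumed objective: geometric mean, which is low if either rate is low."""
    return math.sqrt(sens * spec)


def aco_search(score, c_grid, gamma_grid, n_ants=10, n_iter=30,
               evaporation=0.3, tau_min=1e-3, seed=0):
    """Ant-colony-style search over a discrete (C, gamma) grid.

    score(C, gamma) must return a non-negative value; higher is better.
    """
    rng = random.Random(seed)
    # One pheromone value per candidate value of each parameter.
    tau_c = [1.0] * len(c_grid)
    tau_g = [1.0] * len(gamma_grid)
    best_params, best_score = None, -math.inf
    for _ in range(n_iter):
        visits = []
        for _ in range(n_ants):
            # Each ant picks one value per parameter, weighted by pheromone.
            i = rng.choices(range(len(c_grid)), weights=tau_c)[0]
            j = rng.choices(range(len(gamma_grid)), weights=tau_g)[0]
            s = score(c_grid[i], gamma_grid[j])
            visits.append((i, j, s))
            if s > best_score:
                best_params, best_score = (c_grid[i], gamma_grid[j]), s
        # Evaporate old pheromone (with a floor), then deposit by score.
        tau_c = [max(tau_min, (1 - evaporation) * t) for t in tau_c]
        tau_g = [max(tau_min, (1 - evaporation) * t) for t in tau_g]
        for i, j, s in visits:
            tau_c[i] += s
            tau_g[j] += s
    return best_params, best_score


def toy_score(C, gamma):
    """Hypothetical stand-in for cross-validated SVM quality (peak at C=10, gamma=0.1)."""
    return math.exp(-(math.log10(C) - 1) ** 2 - (math.log10(gamma) + 1) ** 2)


c_grid = [10.0 ** k for k in range(-2, 4)]
gamma_grid = [10.0 ** k for k in range(-4, 2)]
params, value = aco_search(toy_score, c_grid, gamma_grid)
print("selected (C, gamma):", params, "score:", value)
```

In a real setting, `toy_score` would be replaced by a function that trains an SVM with the given parameters on the positive and negative pre-microRNA feature vectors, evaluates it by cross-validation, and returns something like `balanced_objective(*sensitivity_specificity(y_true, y_pred))`.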
[Degree-granting institution]: Hebei University of Technology
[Degree level]: Master's
[Year degree granted]: 2011
[Classification number]: TP181; R346
Article ID: 1703040
Article link: http://sikaile.net/xiyixuelunwen/1703040.html