Research on Ensemble Feature Selection and Gene Regulatory Network Construction

Published: 2019-04-26 13:27
[Abstract]: With the rapid development of bioinformatics and the emergence of massive genomic data, research has entered the post-genomic era. Researchers are no longer limited to studying the function of a single gene; they aim to understand, from a systems perspective, the complex processes that sustain life. Against this background, systems biology has developed rapidly. One of its central challenges is the construction of gene regulatory networks, which describe the interactions between genes as a graph. Reconstructing these networks by reverse engineering helps explain the molecular mechanisms that keep an organism stable when environmental conditions fluctuate. With the development of DNA microarray technology and the rapid accumulation of gene expression data, many methods for constructing gene regulatory networks have appeared. Gene sequence data and functional annotation data are also accumulating. Different types of data often carry different information, so making effective use of the complementarity between data sources is essential for constructing gene regulatory networks accurately.
Feature selection methods that construct gene regulatory networks from gene expression data have a shortcoming: they usually give only an importance score for each potential edge in the network and do not determine a suitable threshold for converting the ranking into a network structure. To address this, the thesis proposes the Ensemble Feature Importance-Genetic Algorithm (EFI-GA), which combines an ensemble feature selection algorithm with a genetic algorithm to construct gene regulatory networks. First, the ensemble feature selection method computes an importance score for each potential regulator of a target gene; the score expresses the confidence that a true regulatory relationship exists between that regulator and the target gene. Then the genetic algorithm selects the optimal subset of regulators from among those with high confidence. Experimental results on the Dialogue for Reverse Engineering Assessments and Methods (DREAM) datasets show the effectiveness of the method.
To respond to external stimuli or to complete a given biological process, transcription factors carry out their functions by regulating target genes, and the two take part in the same process, so they often have the same or similar functions. Taking the functional correlation between transcription factors and target genes into account therefore helps improve the accuracy of the constructed regulatory network. The thesis proposes a multi-feature fusion method that combines gene expression data, gene sequence data, and Gene Ontology (GO) data, so that the characteristics provided by different data sources are used effectively. Feature vectors are built from the multiple data sources, and a support vector machine is trained as a classification model to predict regulatory relationships between transcription factors and target genes. Cross-validation results on Arabidopsis and tomato datasets show that the proposed method achieves higher accuracy.
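The two EFI-GA steps described in the abstract (score every candidate regulator of a target gene, then let a genetic algorithm choose a subset instead of applying a fixed threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not name the ensemble members or the fitness function: the ensemble averages absolute Pearson and Spearman correlations, the fitness is a BIC-style score of a linear fit, and the function names `ensemble_importance`, `rss`, and `select_regulators` are hypothetical.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant input."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def ranks(x):
    """Ordinal ranks (ties broken by position, a simplification)."""
    r = [0.0] * len(x)
    for rank, i in enumerate(sorted(range(len(x)), key=lambda i: x[i])):
        r[i] = float(rank)
    return r

def ensemble_importance(regulators, target):
    """Step 1 (assumed scorers): average |Pearson| and |Spearman| per candidate."""
    rt = ranks(target)
    return [(abs(pearson(g, target)) + abs(pearson(ranks(g), rt))) / 2
            for g in regulators]

def rss(cols, y):
    """Residual sum of squares of a least-squares fit of y on cols plus intercept."""
    X = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    k = len(X[0])
    # Normal equations A b = v; the tiny ridge term guards against a singular A.
    A = [[sum(r[a] * r[b] for r in X) + (1e-9 if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    v = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    for c in range(k):  # Gaussian elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p], v[c], v[p] = A[p], A[c], v[p], v[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            A[r] = [arj - f * acj for arj, acj in zip(A[r], A[c])]
            v[r] -= f * v[c]
    b = [0.0] * k
    for c in reversed(range(k)):  # back substitution
        b[c] = (v[c] - sum(A[c][j] * b[j] for j in range(c + 1, k))) / A[c][c]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
               for r, yi in zip(X, y))

def select_regulators(regulators, target, top_k=4, pop_size=12,
                      generations=30, seed=0):
    """Step 2: a small GA over bit masks of the top_k scored candidates.

    Assumes at least one candidate regulator is supplied.
    """
    rng = random.Random(seed)
    scores = ensemble_importance(regulators, target)
    cand = sorted(range(len(regulators)), key=lambda i: -scores[i])[:top_k]
    n = len(target)

    def fitness(mask):
        cols = [regulators[c] for c, bit in zip(cand, mask) if bit]
        # Assumed BIC-style score (lower is better), negated so higher is fitter.
        return -(n * math.log(rss(cols, target) / n + 1e-12)
                 + len(cols) * math.log(n))

    # Start from "keep every candidate" plus random masks.
    pop = [[1] * len(cand)] + [[rng.randint(0, 1) for _ in cand]
                               for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(cand)) if len(cand) > 1 else 0
            child = a[:cut] + b[cut:]            # one-point crossover
            child[rng.randrange(len(cand))] ^= 1  # single bit-flip mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return scores, sorted(c for c, bit in zip(cand, best) if bit)
```

Called as `scores, selected = select_regulators(expression_rows, target_row)`, the function returns the per-regulator importance scores and the indices of the regulators kept for that target gene; repeating this for every target gene would yield the edges of the network. The size penalty in the fitness is what replaces a hand-picked score threshold in this sketch.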
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: Q811.4;TP18
