
A Multi-Omics Study on the Discovery of the Core Complex Network System Underlying the Molecular Mechanism of Tuberculosis Pathogenesis and the Construction of a Predictive Diagnostic Model

Posted: 2018-06-19 15:25

  Topics: multi-omics + tuberculosis; Source: Beijing Tuberculosis and Thoracic Tumor Research Institute, doctoral dissertation, 2017


[Abstract]: Objective: The global situation for tuberculosis (TB) prevention and control is severe. A deeper understanding of TB pathogenesis is urgently needed, along with more effective prevention and control strategies built on that understanding. This study approaches the problem from a multi-omics perspective. It seeks the core complex network system underlying the molecular mechanism of TB pathogenesis and, on that basis, builds a TB predictive diagnostic model that can be used to screen high-risk populations, providing practical methods and tools for new control strategies and for precision prevention of TB. The study also aims to present evidence for a central-dogma-like principle at the multi-omics level, as a basis for the future development of theoretical biology.

Methods: This is a trans-omics (Transomics) study based on multi-omics big data and computational algorithms. First, genome, nucleome, transcriptome, and proteome data were obtained from major international omics databases. Genome and transcriptome data were then processed with standard pipelines such as PLINK and limma to obtain disease-association statistics for each genetic variant and for gene expression. Next, the available nucleome data were integrated and clustered as a network, and combined with the genome and transcriptome association statistics in a chromatin disease-association analysis, which yielded TB-associated chromatin disease modules. Using the genetic variants located in these modules, a preliminary TB predictive diagnostic model was built with machine learning. Proteome data were then integrated to construct a protein-protein interaction network, and the standardized network matrices of the nucleome and the proteome were tested for correlation to examine the multi-omics central-dogma-like principle. Finally, the protein-protein interaction network was integrated with the TB-associated chromatin disease modules, and network decomposition was used to obtain the core complex network system of TB pathogenesis. This system was used to optimize the predictive diagnostic model, and classification performance was verified by ROC analysis.

Results: Analysis of the genomic data identified 49,236 SNPs with p < 0.05 before multiple-testing correction. Differential expression analysis of the transcriptome data identified 1,594 differentially expressed genes, of which 738 were up-regulated and 856 were down-regulated. The nucleome data were integrated into a standardized 3044 × 3044 matrix. Cluster partitioning and chromatin disease-association analysis produced TB-associated chromatin disease modules containing 101,417 SNPs. The predictive diagnostic model built on these SNPs reached an AUC of 0.926, with sensitivity 0.87 and specificity 0.866, both above 0.85, indicating high classification performance. Proteome data were integrated using the standardized nucleome matrix as the reference and then compared with the nucleome matrix. The two were correlated, indicating a biological association between higher-order chromatin structure and protein-protein interactions. Integrating the TB-associated chromatin modules with the protein-protein interaction network gave a complex network system of 5,846 nodes and 458,653 edges. Hierarchical clustering reduced this to a core network of 2,015 nodes and 61,318 edges, and analysis with the iNP algorithm then yielded 15 kernel network units containing 228 genes. The predictive diagnostic model optimized with these kernel units and a forward search algorithm had an AUC of 0.841, sensitivity 0.768, and specificity 0.769, using 2,260 SNP parameters. Without a large sacrifice in classification performance, the number of model parameters was reduced to about 1/50 of the original model, which keeps the cost of application controllable with existing technology.

Conclusion: Using multi-omics big data from the genome, nucleome, transcriptome, and proteome, and a trans-omics approach, this study tentatively identified the core complex network system of the molecular mechanism of TB pathogenesis and its kernel units. On that basis, machine learning was used to build a TB predictive diagnostic model with good classification performance, offering a candidate tool for screening populations at high risk of TB. At the level of theoretical biology, the study found some clues to the existence of a multi-omics central-dogma-like principle, which serves as early exploratory work toward forming and refining that theory. Regarding the molecular mechanisms of complex diseases, the study outlined a multi-omics analysis workflow that may suit complex diseases in general, and found partial evidence for the importance of gene-gene interrelationships, suggesting that they deserve attention in future research. Because this is a trans-omics study based on biological big data and computational algorithms, its conclusions still need further analysis and validation through laboratory, clinical, and epidemiological studies.
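The abstract evaluates its models by ROC analysis and reports AUC, sensitivity, and specificity. As a minimal sketch of how those three quantities are computed from a classifier's output, the Python below uses a small made-up set of labels and scores. The function names `roc_auc` and `sens_spec`, the toy data, and the 0.5 threshold are illustrative assumptions and are not taken from the thesis, which does not describe its implementation.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec(labels: List[int], scores: List[float],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = case, 0 = control; scores are model outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = roc_auc(labels, scores)
sensitivity, specificity = sens_spec(labels, scores, threshold=0.5)
print(f"AUC={auc:.4f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

AUC is independent of any threshold, while sensitivity and specificity depend on the cut-off chosen, so a reported pair such as 0.87 and 0.866 corresponds to one specific operating point on the ROC curve.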
[Degree-granting institution]: Beijing Tuberculosis and Thoracic Tumor Research Institute
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: R52

