Mixed Linear Model Approaches for Exploring the Genetic Architecture of Complex Traits, and Related Software Development
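Keywords: complex traits; crop seed traits; mixed linear model; linkage analysis; genome-wide association analysis; epistasis; gene-by-environment interaction effects. Source: Zhejiang University, doctoral dissertation, 2016. Document type: dissertation.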
[Abstract]: Crop seeds are an important source of human food, animal feed and industrial raw materials, and consist mainly of a diploid embryo and a triploid endosperm. Most agronomic traits, including seed traits, are complex traits: they are not controlled by a single gene alone but are also affected by epistasis and by gene-by-environment interaction. With the development of high-throughput sequencing, genome-wide association analysis has become an effective means of detecting the genetic variants underlying human diseases and complex agricultural traits. However, most current association methods rest on a simple additive model and analyze only one quantitative trait at a time. To address the problems in linkage analysis of crop seed traits and in multi-trait genome-wide association analysis, we developed new methods based on mixed linear models to dissect the genetic architecture of complex traits. Monte Carlo simulations and analyses of real data both demonstrated the unbiasedness and reliability of the new methods. The thesis comprises three chapters.

Chapter 1 reviews recent progress in linkage analysis and association analysis, together with the software developed for them. It also introduces the statistical methods commonly used for hypothesis testing and parameter estimation in mixed linear models.

Chapter 2 presents newly developed experimental designs and statistical methods for mapping seed traits with mixed linear models. The seeds of flowering plants arise from double fertilization; they play an important role in reproduction and are the main source of animal feed and human food. Seed development involves several genetic systems, such as the maternal genome, the embryo genome and the endosperm genome. Because of this multi-system inheritance, and in particular because of epistasis within and between genomes and the interaction of each genetic component with the environment, studying the genetic mechanisms of seed traits is a major challenge. Based on the genetic characteristics of seed traits, we proposed two statistical genetic models that include maternal additive and dominance effects, embryo or endosperm additive and dominance effects, additive-by-additive epistasis within the maternal genome, additive-by-additive epistasis within the embryo or endosperm genome, additive-by-additive epistasis between the maternal genome and the embryo or endosperm genome, and the interactions of all these effects with the environment. The genetic mapping population can be generated by random mating of an immortalized F2 population, by backcrossing the immortalized F2 to both parents in both directions, or by selfing the immortalized F2 population. Monte Carlo simulations examined how different heritabilities and different models affect parameter estimation. An analysis of cotton seed traits also confirmed the reliability of the method. Based on the proposed method, we developed the software QTLNetwork-Seed-1.0.exe for mapping seed traits.

Chapter 3 presents a newly developed multi-trait genome-wide association analysis method based on mixed linear models, together with its statistical software. With the development of high-throughput sequencing, genome-wide association analysis has become a widely used approach to exploring the genetic architecture of complex traits. A major problem in association analysis, however, is that relatedness among individuals and correlation among loci can produce false positives; the mixed linear model is an effective way to control for population structure. In addition, most complex disease syndromes comprise a set of highly correlated clinical or molecular phenotypes, so these traits should be analyzed jointly to detect genetic variants shared by multiple traits. Yet most current methods are single-trait association models based on additive effects. We therefore extended the multivariate mixed linear model to include epistasis and gene-by-environment interaction. The new method can detect both pleiotropic genes and genes with trait-specific effects. Extensive simulations investigated how different residual correlation coefficients, heritabilities and models affect mapping power and the precision of effect estimates. Real rice data also demonstrated the effectiveness of the method. Based on the proposed method, we developed the software JAMT (Joint Analysis for Multiple Traits) for joint association analysis of multiple traits.
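The abstract reports that Monte Carlo simulation was used to check that effect estimates are unbiased. The following is a minimal sketch of that kind of check, not the thesis's method: it uses ordinary least squares on a fixed-effects model as a stand-in for the mixed linear model, and every name, sample size and effect value in it (`simulate_once`, `monte_carlo`, the two loci, the two environments) is a hypothetical choice made for illustration.

```python
import numpy as np

# Hypothetical true effects: additive effects of two loci (a1, a2),
# their additive-by-additive epistasis (aa), and the interaction of
# locus 1 with the environment (ae).
TRUE_EFFECTS = np.array([1.0, 0.5, 0.8, 0.4])


def simulate_once(rng, n=400):
    """Simulate one data set and return least-squares estimates of the effects."""
    a1, a2, aa, ae = TRUE_EFFECTS
    # Genotypes of homozygous lines coded -1 / +1 at each locus (assumption).
    g1 = rng.choice([-1.0, 1.0], size=n)
    g2 = rng.choice([-1.0, 1.0], size=n)
    # Two environments coded -1 / +1, assigned at random (assumption).
    env = rng.choice([-1.0, 1.0], size=n)
    # Design matrix: mean, a1, a2, epistasis, G x E, environment main effect.
    X = np.column_stack([np.ones(n), g1, g2, g1 * g2, g1 * env, env])
    beta = np.array([10.0, a1, a2, aa, ae, 0.3])
    y = X @ beta + rng.normal(0.0, 1.0, size=n)
    est, *_ = np.linalg.lstsq(X, y, rcond=None)
    return est[1:5]  # keep only the four genetic effects


def monte_carlo(n_rep=200, seed=0):
    """Average the estimates over n_rep simulated data sets."""
    rng = np.random.default_rng(seed)
    estimates = np.array([simulate_once(rng) for _ in range(n_rep)])
    return estimates.mean(axis=0)


mean_estimates = monte_carlo()
bias = mean_estimates - TRUE_EFFECTS
print("mean estimates:", np.round(mean_estimates, 3))
print("bias:", np.round(bias, 3))
```

If the estimator is unbiased, the averaged estimates should sit close to the simulated true values, with the gap shrinking as the number of replicates grows. A faithful reproduction would replace the least-squares step with a mixed linear model that treats the genetic and interaction effects as random.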
[Author]: Qi Ting
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctorate
[Year degree granted]: 2016
[Classification numbers]: TP311.52; Q348