Research on Tetraploid Haplotype Reconstruction Algorithms and Software Development
Published: 2018-12-08 10:41
[Abstract]: Genetic polymorphism originates from single nucleotide polymorphisms (SNPs), and the analysis of SNPs is of great significance in genetics. Haplotypes, which are composed of sequences of SNP sites, carry more genetic information than a single SNP. Haplotype analysis and detection play an important role in understanding gene function, diagnosing complex diseases, and precisely locating the heritable genes of a species. Unfortunately, determining haplotypes directly by biological methods is currently too expensive, so reconstructing haplotypes computationally has great practical value. Past research focused mainly on diploids; as the field has advanced and practical needs have grown, the reconstruction problem for higher ploidy levels has come under study. This thesis focuses on tetraploid haplotype reconstruction and proposes two algorithms, EHTS and EHTD, based on the MEC/GI model (Minimum Error Correction with Genotype Information). The EHTS algorithm computes the support degree of every possible arrangement at each haplotype site and selects the arrangement with the largest support as the SNP values at that site. This process is iterated until the values at all sites are determined, which yields the haplotypes. In comparative experiments, EHTS performs well under a variety of parameter settings, runs fast, and achieves a higher reconstruction rate than the W-GA and Q-PSO algorithms. The EHTD algorithm computes the difference degree at each site and reconstructs the haplotypes by selecting the arrangement with the smallest difference. Experiments show that it also reconstructs haplotypes better than W-GA and Q-PSO, and in a few cases it achieves a higher reconstruction rate than EHTS.
Building on the EHTD and EHTS experiments, this thesis designs an application for tetraploid haplotype reconstruction. The software is developed in C# and consists of an input module, an algorithm reconstruction module, and an output module. The input module takes its data mainly by reading files. The reconstruction module performs the haplotype reconstruction; it is the core of the software, integrates the EHTD and EHTS algorithms, and reconstructs tetraploid haplotypes efficiently. The output module displays the four reconstructed haplotypes in the output window and also writes the results to a file so that the data can be kept. The fragment data module follows the common conventions for fragment data, so the software generalizes well. In summary, this thesis studies the tetraploid haplotype reconstruction problem, proposes effective reconstruction methods, and designs a corresponding application. This work has research and practical value and lays a foundation for further study of tetraploid species.
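The abstract does not define the support degree, so the following Python fragment is only a rough sketch of the per-site selection idea under assumptions of our own. It assumes biallelic sites coded 0/1, a genotype given as the number of haplotypes carrying allele 1 (`dosage`), and reads that have already been assigned to one of the four haplotypes. The function names and the agreement-count definition of support are hypothetical, not taken from the thesis.

```python
from itertools import permutations


def site_arrangements(dosage, ploidy=4):
    """Return all distinct 0/1 assignments to `ploidy` haplotypes
    that contain exactly `dosage` ones (the genotype constraint)."""
    base = [1] * dosage + [0] * (ploidy - dosage)
    return sorted(set(permutations(base)))


def support(arrangement, reads):
    """Hypothetical support degree: the number of reads whose observed
    allele agrees with the haplotype they are assigned to.

    `reads` is a list of (haplotype_index, allele) pairs for one site.
    """
    return sum(1 for hap, allele in reads if arrangement[hap] == allele)


def best_arrangement(dosage, reads, ploidy=4):
    """Pick the genotype-consistent arrangement with the largest support."""
    candidates = site_arrangements(dosage, ploidy)
    return max(candidates, key=lambda a: support(a, reads))


# Illustrative data only: six reads covering one site, genotype dosage 2.
reads = [(0, 1), (0, 1), (1, 0), (2, 1), (3, 0), (3, 1)]
print(best_arrangement(2, reads))
```

Under the same assumptions, a difference-degree variant in the spirit of EHTD would count disagreements instead and take the minimum over the same candidate arrangements.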
[Degree-granting institution]: Guangxi Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: Q811.4; TP311.52
Article number: 2368212
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2368212.html