

Storage and Compression of DNA Sequence Alignment Results

Published: 2018-04-08 17:00

  Topic: DNA sequence alignment results. Focus: storage. Source: Fudan University, master's thesis, 2012


[Abstract]: With advances in bioinformatics, molecular biology, and related disciplines, and with the completion of the Human Genome Project, more and more genes of humans and other model organisms have been sequenced. Sequence alignment is a method for processing sequencing output. It can reveal structural, functional, and evolutionary relationships between biological sequences, and it is a foundation of bioinformatics.

As these sequencing projects proceed, massive amounts of DNA sequence data are produced every day, and aligning that data yields a correspondingly large volume of alignment results. The rapid development of storage devices has eased the sharp growth in data volume to some extent. However, as alignment research deepens, simply adding hardware can no longer keep pace with the rapid growth of DNA alignment result data, and the cost of storing and using this data will eventually become unaffordable.

Next-generation sequencing (NGS) platforms have greatly reduced the cost of sequencing, making it feasible to apply gene sequence analysis in practical medical settings. From the standpoint of both storage and application, therefore, compressing sequence alignment results plays an important role in the storage, management, and transmission of DNA data. Compression of DNA sequence data has attracted wide attention from researchers in China and abroad, but few have studied how to compress alignment results in real medical scenarios. Storing gene alignment results still faces major challenges.

In this thesis, starting from the application needs of medical scenarios, we design a storage structure that meets those needs and, on top of it, two different compression strategies to reduce storage cost. Experimental data show that, as coverage increases, our compression scheme slightly outperforms standard RAR and ZIP compression. Based on these methods, we built a "DNA Sequence Alignment Result Storage and Compression System", which stores massive DNA alignment results and provides a graphical interface.
[Degree-granting institution]: Fudan University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333

