
Research on Cancer Gene Data Classification Based on the SVM Algorithm

Published: 2018-04-27 11:53

  Topic: DNA microarray + gene expression data; Source: Soochow University, 2015 master's thesis


[Abstract]: Cancer is one of the major diseases that pose a serious threat to human life, and early diagnosis is the key to improving the survival rate of cancer patients. With the rapid development of DNA microarray technology, large volumes of cancer gene expression data have accumulated. Building on molecular biology, how to use these data for early cancer diagnosis has become a research focus of the post-genomic era. However, cancer gene expression data are typically high-dimensional, have few samples, and are nonlinear, which makes them difficult to classify. To address these characteristics, this thesis applies a classification method based on the support vector machine (SVM) to cancer data samples. SVM is a machine learning method developed on the basis of statistical learning theory. It adopts the structural risk minimization principle in place of empirical risk minimization and uses kernel functions to turn a nonlinear problem into a linear one, which gives it distinct advantages in pattern recognition problems with limited samples, nonlinearity, and high dimensionality. Although SVM effectively mitigates under-fitting and over-fitting, the small sample size and high dimensionality of gene expression data still affect classification accuracy. Classifying the raw data directly is laborious and does not give satisfactory results, so dimensionality reduction becomes the key issue in cancer gene data classification.
This thesis first reduces the dimensionality of the original gene expression data and then classifies the lower-dimensional data with SVM. By comparing several dimensionality reduction methods and setting the SVM parameters appropriately, high diagnostic accuracy can be achieved. The dimensionality reduction methods used include sparse principal component analysis, generalized discriminant analysis (GDA), and Laplacian eigenmaps, among others. The focus of the thesis is how to use dimensionality reduction to optimize the data. Experiments on two publicly available data sets show that GDA gives the best dimensionality reduction for the Prostate Tumor data, while MDS gives the best for the Leukemia data. The results indicate that finding the best dimensionality reduction method and tuning the SVM parameters appropriately can effectively optimize gene data, improve SVM classification performance, and yield higher classification accuracy.
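The abstract's remark that kernel functions turn a nonlinear problem into a linear one can be illustrated with a small toy sketch. It is not taken from the thesis: the degree-2 polynomial kernel, the XOR-style data, and the hand-picked weight vector `w` are all assumptions chosen for illustration, and no SVM training is performed. The sketch only checks that the kernel value equals an inner product in an explicit feature space, and that data which no straight line separates in the input space becomes linearly separable there.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

def feature_map(x):
    """Explicit map phi for 2-D inputs with phi(x) . phi(z) = (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

# XOR-style toy data (assumed): +1 when both coordinates share a sign, else -1.
# No single line in the 2-D input space separates the two classes.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# Hand-picked (not learned) separating direction in the 3-D feature space:
# it looks only at the x1*x2 coordinate of phi(x).
w = (0.0, 1.0, 0.0)

def predict(x):
    """Linear decision rule applied in feature space."""
    return 1 if dot(w, feature_map(x)) > 0 else -1

# The kernel computes the feature-space inner product without building phi.
# abs_tol guards against floating-point error when the true value is 0.
kernel_matches = all(
    math.isclose(poly2_kernel(x, z), dot(feature_map(x), feature_map(z)), abs_tol=1e-9)
    for x in points
    for z in points
)
predictions = [predict(x) for x in points]
```

In an actual SVM the separating direction is learned from the data and the feature map is never formed explicitly; only kernel values between samples are evaluated, which is what makes the approach workable when samples are few and dimensions are many.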
[Degree-granting institution]: Soochow University
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: R730.4; TP311.13



