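Research on a Raman Spectroscopy-Based Model for Identifying Benign and Malignant Breast Tumors
Published: 2018-05-04 18:11
Topics: breast cancer + Raman spectroscopy; Source: Northeast Normal University, master's thesis, 2017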
[Abstract]: Breast cancer is one of the most common cancers among women worldwide, and its incidence is increasing year by year. Raman spectroscopy can characterize and explain changes in tissue composition at the molecular level, and it offers high sensitivity and non-destructive measurement for disease diagnosis and for in situ detection in living tissue. However, Raman spectral data are high-dimensional and the measurements contain noise, so using the raw spectra directly to distinguish benign from malignant breast tumors is difficult. A model that can discriminate benign from malignant breast tumors is therefore urgently needed so that more targeted treatment can be given. Building a recognition model by applying machine learning algorithms to Raman spectral data can raise the recognition rate for breast tumors and improve the efficiency of manual consultation, leading to better treatment outcomes. This thesis collected Raman spectral data from 168 female samples, provided by the Department of Breast Surgery of the First Hospital of Jilin University. The collected spectra are complex, with high dimensionality and a small number of samples, so building a classification model on them directly is prone to overfitting. Therefore, based on earlier work by researchers, representative characteristic peaks in the Raman spectra of benign and malignant breast tissue were summarized; studies indicate that these peaks reflect the changes in tissue composition that occur when breast tissue becomes diseased. This step reduces the data dimension. Classification models were then built with support vector machine (SVM), extreme learning machine (ELM) and K-nearest neighbor (KNN) methods.
The experiments show that models built on the summarized peaks reach classification accuracies ranging from 51.67% to 85.00%, and that the models are clearly biased toward the malignant class, which undermines the purpose of the classification. To address these problems, feature selection and feature extraction were used to find the optimal feature subset, with the aim of a more accurate and more stable model. Sequential forward selection (SFS), Relief-F and joint sparse discriminant analysis (JSDA) were each applied to the characteristic peaks of breast tissue to find the optimal feature subset, and models were then built with each of the classification methods mentioned above. The results show that classification models built on the feature subsets chosen by feature selection and feature extraction predict more accurately than models built on all characteristic peaks. Among them, the model based on KNN and JSDA (KNN-JSDA) achieved the best classification accuracy: 93.12% for identifying benign and malignant breast tumors. The Kappa coefficient of the KNN-JSDA model is 0.84, which indicates that the classification results have reference value. These results suggest that the KNN-JSDA model established in this thesis can identify benign and malignant breast tumors well.
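As a rough illustration of two ideas named in the abstract, the sketch below wraps sequential forward selection (SFS) around a small K-nearest-neighbor classifier and then computes Cohen's Kappa coefficient. It is not the thesis's implementation: the toy peak intensities, the leave-one-out scoring, the choice of k = 3 and every function name are assumptions made for this example, and JSDA is not shown.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def loo_predictions(X, y, features, k=3):
    """Leave-one-out KNN predictions using only the given feature columns."""
    Xs = [[row[j] for j in features] for row in X]
    return [
        knn_predict(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:], Xs[i], k)
        for i in range(len(Xs))
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sequential_forward_selection(X, y, k=3):
    """Greedily add the feature that most improves leave-one-out accuracy.

    Stops as soon as no remaining feature improves the score.
    """
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        scored = [
            (accuracy(y, loo_predictions(X, y, selected + [j], k)), j)
            for j in remaining
        ]
        # Highest score wins; ties go to the lowest feature index.
        score, j = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement beyond what the label frequencies give by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    labels = set(true_counts) | set(pred_counts)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if p_chance == 1.0:  # only one class present on both sides
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Invented toy data, NOT from the thesis: three hypothetical peak intensities
# per sample. Only column 0 separates the two classes; columns 1 and 2 are noise.
toy_X = [
    [1.0, 10.0, 3.0], [1.2, 50.0, 7.0], [0.9, 30.0, 1.0], [1.1, 70.0, 9.0],
    [5.0, 12.0, 8.0], [5.2, 52.0, 2.0], [4.9, 32.0, 6.0], [5.1, 72.0, 4.0],
]
toy_y = ["benign"] * 4 + ["malignant"] * 4

selected, loo_acc = sequential_forward_selection(toy_X, toy_y)
print("selected feature columns:", selected)
print("leave-one-out accuracy:", loo_acc)

# A classifier biased toward "malignant": moderate accuracy, low Kappa.
biased_true = ["benign"] * 10 + ["malignant"] * 10
biased_pred = ["malignant"] * 8 + ["benign"] * 2 + ["malignant"] * 10
print("biased accuracy:", accuracy(biased_true, biased_pred))
print("biased kappa:", round(cohen_kappa(biased_true, biased_pred), 2))
```

On the toy data the selection keeps only the informative column. The biased example reaches 60% accuracy with a Kappa of only about 0.2, which is why Kappa is a useful companion to accuracy when a model leans toward one class. Scoring the selection on the same samples it was tuned on is optimistic, so a real study would confirm the chosen subset on held-out data.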
[Degree-granting institution]: Northeast Normal University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: R737.9; TP181
Article ID: 1844051
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1844051.html