Research on the Prediction and Analysis of Ovarian Tumors Based on Machine Learning
Topics: machine learning + data mining; Source: Jilin University, master's thesis, 2016
[Abstract]: Since the start of the 21st century, information technology has developed rapidly and computers have come to play an increasingly important role in society. As hospitals have digitized (through hospital information systems and electronic medical records) and data storage technology has improved, hospital databases have accumulated large volumes of data. At present, however, most hospitals still handle this data only through basic create, delete, update, and query operations. They lack data integration and analysis techniques and cannot use the data already collected to support medical decision-making or to acquire knowledge automatically. At the same time, traditional data analysis and processing methods can no longer uncover the hidden information and internal correlations in such large data sets. Data collection and storage have advanced quickly; the main problem now is how to put this hard-won data to practical use.

After studying the theoretical foundations of data mining, this thesis first applies them to screen and preprocess the key medical data collected for evaluating ovarian tumors. It then selects four machine learning classifiers suited to medical data mining. The support vector machine (SVM) performs well on small sample sets, nonlinear sample sets, and pattern recognition tasks that involve high-dimensional data requiring dimensionality reduction, and it can be extended to other problems such as function fitting. The naive Bayes classifier has a solid mathematical foundation, gives stable classification results, needs only a small sample space, is insensitive to defective data sets, and is algorithmically simple. The nearest neighbor classifier works well for sample sets whose class regions cross or overlap heavily. The random forest algorithm can produce highly accurate classifiers for many kinds of data, handles a large number of input variables, and learns quickly. In addition, the thesis designs an artificial neural network algorithm for the collected data; its self-learning ability, its ability to find optimal solutions quickly, and its associative memory make it effective for building data classification algorithms.

The five algorithms are each used for classification and prediction. The experimental results are checked with statistical methods, and their accuracy is compared with results reported in domestic and international studies. The results are interpreted from a machine learning perspective, and the overall performance of the algorithms is evaluated. From this analysis, classification rules for clinical ovarian tumor data are extracted, with the goal of predicting ovarian cancer early to assist clinical diagnosis, so that early prediction and early treatment can improve the survival rate of ovarian cancer patients.
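The comparison the abstract describes can be illustrated with a small sketch. This is not the thesis's implementation or data. It assumes a synthetic two-class data set (two Gaussian blobs standing in for hypothetical benign/malignant feature vectors), uses only the Python standard library, and covers only two of the five classifier families named above (nearest neighbor and Gaussian naive Bayes). Every function name, the value k=5, and the fold count are illustrative assumptions.

```python
import math
import random
from statistics import mean, pvariance


def make_synthetic(n_per_class=60, seed=0):
    """Two Gaussian blobs: a hypothetical stand-in, NOT real clinical data."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (4.0, 4.0))):
        for _ in range(n_per_class):
            data.append(([rng.gauss(c, 1.0) for c in center], label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = sum(label for _, label in nearest)  # labels are 0 or 1
    return 1 if votes * 2 > k else 0


def gnb_fit(train):
    """Estimate a class prior plus per-feature mean and variance for each class."""
    model = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        cols = list(zip(*rows))
        model[label] = (
            len(rows) / len(train),
            [mean(c) for c in cols],
            [pvariance(c) + 1e-9 for c in cols],  # small floor avoids division by zero
        )
    return model


def gnb_predict(model, x):
    """Pick the class with the highest Gaussian log-posterior."""
    def log_post(params):
        prior, mus, variances = params
        total = math.log(prior)
        for xi, mu, var in zip(x, mus, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda label: log_post(model[label]))


def cross_val_accuracy(data, fit, predict, folds=5):
    """Mean accuracy over `folds` disjoint held-out folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        model = fit(train)
        correct = sum(predict(model, x) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


data = make_synthetic()
# For kNN the "model" is simply the stored training set.
knn_acc = cross_val_accuracy(data, fit=lambda tr: tr, predict=knn_predict)
gnb_acc = cross_val_accuracy(data, fit=gnb_fit, predict=gnb_predict)
print(f"kNN accuracy: {knn_acc:.3f}")
print(f"Gaussian naive Bayes accuracy: {gnb_acc:.3f}")
```

Holding out each fold in turn gives every classifier the same train/test splits, so the two mean accuracies are directly comparable. A real study would add the remaining classifiers and a statistical test on the per-fold scores.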
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP181; TP311.13
Article ID: 1799146
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1799146.html