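Research on a Diabetes Prediction Model Based on Machine Learning Algorithms
Topic: diabetes. Focus: risk factors. Source: Harbin Institute of Technology, 2016 master's thesis. Document type: degree thesis.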
[Abstract]: China currently has the largest number of chronic disease patients in the world, and diabetes and its complications account for an important share of that burden. Residents' demand for health services is strong, so there is a clear need for a diabetes prediction model that estimates the risk of onset in the general population, identifies high-risk groups, and supports forecasting and early warning of the disease. Building on previous studies, this thesis analyzes the risk factors for diabetes. Stepwise regression on the feature variables of the 2014 physical examination dataset from the Harbin Institute of Technology campus hospital yields the risk factors significantly associated with diabetes, which are retained as input variables for a BP neural network model, a support vector machine model, and an ensemble learning model. Machine learning algorithms offer good accuracy and generalization on relatively complex problems. The 2,728 records in the sample set are divided into a training set, a test set, and an independent sample set in a 7:2:1 ratio, and machine learning simulations are run separately with the BP artificial neural network, support vector machine, and ensemble learning models. The input variables, the model parameters, and the choice of kernel function all affect the prediction results to some degree. The study observes how changes in network structure, learning rate, penalty factor, kernel function, and related parameters affect the predictions, then tunes these parameters to find the best model for each algorithm. Finally, the models are tested on the independent samples. The predictions of all three models correlate strongly with the original data, which indicates that the modeling is statistically meaningful. Among them, the best artificial neural network model achieves a higher test-set AUC and a shorter running time. The artificial neural network with a 7-1-1 structure is therefore selected as the most suitable model for diabetes prediction in this study.
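The 7:2:1 split and the 7-1-1 network structure mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the random shuffle and seed, the integer rounding of the split sizes, and the sigmoid activations are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def split_7_2_1(n_samples, seed=0):
    """Shuffle indices 0..n_samples-1 and split them roughly 7:2:1.

    The shuffle, the seed, and the floor rounding are assumptions;
    the abstract only states the ratio.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 7 // 10
    n_test = n_samples * 2 // 10
    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    independent = indices[n_train + n_test:]  # remainder, about 10%
    return train, test, independent


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward_7_1_1(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 7-1-1 network: 7 inputs, 1 hidden unit, 1 output.

    Sigmoid activations are assumed here for illustration only.
    """
    if len(x) != 7 or len(w_hidden) != 7:
        raise ValueError("expected 7 input features and 7 hidden weights")
    hidden = sigmoid(sum(w * xi for w, xi in zip(w_hidden, x)) + b_hidden)
    return sigmoid(w_out * hidden + b_out)


train, test, independent = split_7_2_1(2728)
print(len(train), len(test), len(independent))  # 1909 545 274
```

Under these assumptions, a 7-1-1 network has only 10 trainable parameters (7 hidden weights, 1 hidden bias, 1 output weight, 1 output bias), which is consistent with the short running time the abstract reports for the selected model.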
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: F224;R587.1
Article ID: 1603613
Article link: http://sikaile.net/yixuelunwen/nfm/1603613.html