Research on Disease Risk Analysis Methods Based on Multi-Label Physical Examination Data

Published: 2019-03-15 21:01

[Abstract]: Health check-ups are an important part of disease prevention. From an individual's check-up results, a doctor can identify potential conditions early and give health guidance. Traditionally, an experienced doctor reviews the examination results for each part of the body and gives an overall assessment of health status and disease risk. As the volume of data grows and doctors' experience varies widely, manual analysis can no longer meet the rising demand for check-ups in either efficiency or accuracy. With the development of data mining, artificial intelligence and machine learning methods have been widely applied to computer-aided diagnosis and disease risk analysis. Data preprocessing is an important step in machine learning. In medical check-up data, results often vary considerably between individuals: for a given feature, the standard deviation across the population is relatively large, and far more values fall below the mean than above it, so the distribution is highly skewed and unstable. Conventional normalization methods do not handle this well. A mathematical transformation can mitigate the problem and, to some extent, improve the model's convergence speed and accuracy. The main contributions of this thesis are: (1) a Fusion Normalization (FN) method that stabilizes feature distributions and normalizes feature values to the interval (0,1); (2) for the multi-label problem, three combined models, SVMs, GBDTs, and LRs, built with SVM, GBDT, and LR as base classifiers, to handle medical multi-label data; (3) for the data imbalance that arises because people with normal indicator values outnumber those with abnormal values, a different penalty factor for each label, set according to the ratio of the label data. The dataset contains 62 features, such as sex and fasting blood glucose, and 3 labels: hypertension, diabetes, and fatty liver. It includes both character and numeric data types. The experiments show that, compared with unnormalized data, Max_min normalization, and standard normalization, check-up data processed with the FN method yields higher accuracy, to varying degrees, on the combined models SVMs, GBDTs, and LRs.
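The abstract does not give the FN formula or the exact penalty-factor rule, so the following is only a minimal sketch of the two ideas under stated assumptions. It stands in a log transform followed by a logistic squash for the stabilizing step (this is not the thesis's FN method), and it uses the common inverse-frequency heuristic for the per-label penalty factors. The function names `skew_aware_normalize` and `penalty_weights` and all sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def skew_aware_normalize(values):
    """Map a right-skewed, non-negative feature strictly into (0, 1).

    Assumption: log1p + z-score + logistic squash, used here only as a
    stand-in for the thesis's FN method, which the abstract does not define.
    """
    logged = [math.log1p(v) for v in values]  # damp the long right tail
    mu = mean(logged)
    sigma = pstdev(logged) or 1.0  # fall back to 1.0 for a constant feature
    # The logistic function keeps every output strictly between 0 and 1.
    return [1.0 / (1.0 + math.exp(-(x - mu) / sigma)) for x in logged]


def penalty_weights(labels):
    """Penalty factor per class for one binary label (0 = normal, 1 = abnormal).

    Assumption: weight = n_samples / (2 * class_count), so the rarer class
    receives the larger penalty.
    """
    n = len(labels)
    pos = sum(labels)
    neg = n - pos
    if pos == 0 or neg == 0:
        return {0: 1.0, 1: 1.0}  # only one class present: nothing to rebalance
    return {0: n / (2 * neg), 1: n / (2 * pos)}


if __name__ == "__main__":
    # Hypothetical fasting blood glucose readings with two high outliers.
    glucose = [4.8, 5.1, 5.0, 5.3, 4.9, 5.2, 11.6, 15.4]
    print(skew_aware_normalize(glucose))

    # Hypothetical label columns; each label gets its own penalty factors.
    label_columns = {
        "hypertension": [0, 0, 1, 0, 0, 0, 1, 0],
        "diabetes": [0, 0, 0, 0, 0, 0, 1, 1],
        "fatty_liver": [0, 1, 0, 0, 1, 0, 1, 0],
    }
    for name, column in label_columns.items():
        print(name, penalty_weights(column))
```

In a one-classifier-per-label setup such as the SVMs, GBDTs, and LRs models described above, weights of this kind would typically be passed to each base classifier separately, so that a label with few abnormal cases is penalized more heavily for missing them.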
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: R194.3;TP18


Article ID: 2440983


Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2440983.html
