Research on Disease Risk Analysis Methods Based on Multi-Label Physical Examination Data
[Abstract]: Health check-ups are an important part of disease prevention. Doctors can identify underlying symptoms from an individual's check-up results and then provide health guidance. Traditionally, experienced doctors assess overall health and disease risk from the examination results for each part of the body. As the volume of data grows and doctors' levels of experience vary, manual analysis can no longer meet the increasing demand for physical examinations in either efficiency or accuracy. With the development of data mining technology, artificial intelligence and machine learning methods have been widely used in computer-aided medical diagnosis and disease risk analysis. Data preprocessing is one of the important steps in machine learning. In medical examination data, results often differ considerably between individuals. For a given feature, the standard deviation of its values across the whole population is relatively large, and the number of values below the mean is far higher than the number above it, so the distribution is highly skewed. Traditional data normalization methods do not handle this problem well. Applying a mathematical transformation can address it and can improve the convergence speed and precision of the model to some extent. The main work of this paper is as follows: (1) the FN (Fusion normalization) method is proposed to stabilize the features and normalize the feature values to (0,1); (2) for the multi-label problem, three combination models, SVMs, GBDTs, and LRs, are built on the SVM, GBDT, and LR classifiers to handle medical multi-label data; (3) to address the data imbalance caused by the normal population being larger than the abnormal population, different penalty factors are set for different labels according to the ratio of label data.
The data set contains 62 features, such as gender, fasting blood glucose, hypertension, diabetes, and fatty liver. The data types in the data set are character and numeric. The experimental results show that, with the FN (Fusion normalization) method, the accuracy of the combination models SVMs, GBDTs, and LRs improves to some extent compared with non-normalized data, the Max_min normalization method, and the standard normalization method.
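The abstract does not define the FN transformation or the exact penalty factors, so the following is only a minimal sketch under stated assumptions. It assumes that the "mathematical transformation" is a log transform followed by min-max scaling, and that each label's penalty factors are inversely proportional to class frequency. The function names `fusion_normalize` and `label_penalties` and the `eps` parameter are hypothetical and do not come from the thesis.

```python
import math


def fusion_normalize(values, eps=1e-6):
    """Hypothetical stand-in for FN (not the thesis's definition).

    Assumption: log1p compresses the long right tail of a skewed feature,
    then min-max scaling maps the result into the open interval (0, 1).
    """
    if any(v < 0 for v in values):
        raise ValueError("this sketch assumes non-negative measurements")
    logged = [math.log1p(v) for v in values]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # A constant feature carries no information; map it to the midpoint.
        return [0.5 for _ in logged]
    # Clamp to [eps, 1 - eps] so every value stays strictly inside (0, 1).
    return [min(1 - eps, max(eps, (x - lo) / (hi - lo))) for x in logged]


def label_penalties(label_matrix):
    """Per-label penalty factors, one dict {0: w_normal, 1: w_abnormal} per label.

    Assumption: weights are inversely proportional to class frequency,
    w_c = n_samples / (2 * count_c), so the rarer class is penalized more.
    """
    n = len(label_matrix)
    penalties = []
    for j in range(len(label_matrix[0])):
        pos = sum(row[j] for row in label_matrix)
        neg = n - pos
        if pos == 0 or neg == 0:
            # Only one class present: no rebalancing is possible.
            penalties.append({0: 1.0, 1: 1.0})
        else:
            penalties.append({0: n / (2 * neg), 1: n / (2 * pos)})
    return penalties


if __name__ == "__main__":
    # Toy values only, not data from the thesis.
    print(fusion_normalize([1.0, 2.0, 4.0, 400.0]))
    print(label_penalties([[1, 0], [0, 0], [0, 0], [0, 1]]))
```

Under these assumptions, one binary classifier would be trained per label on the normalized features, with that label's dictionary supplied as its class weights. For example, scikit-learn's `SVC` and `LogisticRegression` accept such a dictionary through their `class_weight` parameter.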
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: R194.3; TP18