
Research on Disease Risk Analysis Methods Based on Multi-Label Physical Examination Data

Published: 2019-03-15 21:01
[Abstract]: Health check-ups are an important part of disease prevention. From an individual's check-up results, a doctor can identify potential conditions in time and give health guidance accordingly. Traditionally, an experienced doctor reviews the examination results for each part of the body and gives an overall assessment of health status and disease risk. As the volume of data grows and doctors' levels of experience vary widely, manual analysis can no longer meet the rising demand for check-ups in either efficiency or accuracy. With the development of data mining, artificial intelligence and machine learning methods have been widely applied to computer-aided medical diagnosis and disease risk analysis. Data preprocessing is one of the key steps in machine learning. Physical examination results often differ markedly between individuals: for a given feature, the standard deviation of the values across the whole population is relatively large, and far more values fall below the mean than above it, so the distribution is highly uneven. Conventional normalization methods do not handle this well. A mathematical transformation can address the problem and, to some extent, improve the model's convergence speed and accuracy. The main work of this thesis is as follows: (1) the FN (Fusion normalization) method is proposed to stabilize the features and normalize the feature values to the interval (0,1); (2) for the multi-label problem, three combined models, SVMs, GBDTs and LRs, are built with SVM, GBDT and LR respectively as base classifiers to handle medical multi-label data; (3) because people with normal examination indicators outnumber those with abnormal ones, the data are imbalanced, and this is handled by setting a different penalty factor for each label according to the ratio of the label data. The data set contains 62 features, such as sex and fasting blood glucose, and 3 labels: hypertension, diabetes and fatty liver. It includes both character-type and numeric data. The experimental results show that, compared with unnormalized data, Max_min normalization and standard normalization, data processed with the FN (Fusion normalization) method yield higher accuracy, to varying degrees, on the combined models SVMs, GBDTs and LRs.
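The three steps named in the abstract (stabilizing a skewed feature, training one classifier per label, and penalizing errors on the rarer class more heavily) can be illustrated with a minimal Python sketch. Everything in it is an assumption rather than the thesis's implementation: the abstract does not give the FN formula, so `log_minmax` substitutes a plain log transform followed by min-max scaling (which yields the closed interval [0, 1], not the open interval the abstract states); the penalty factor is taken to be the ratio of negatives to positives for each label; and a small hand-written logistic regression stands in for the SVM, GBDT and LR base classifiers. All function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_minmax(values):
    """Compress a right-skewed, non-negative feature with log1p, then rescale to [0, 1].

    Stand-in only: this is NOT the thesis's FN method, whose formula the abstract omits.
    """
    logged = [math.log1p(v) for v in values]
    lo, hi = min(logged), max(logged)
    if hi == lo:  # constant feature: nothing to scale
        return [0.0] * len(logged)
    return [(x - lo) / (hi - lo) for x in logged]


def positive_penalty(y):
    """Assumed penalty factor for the abnormal (positive) class: negatives / positives."""
    pos = sum(y)
    return (len(y) - pos) / pos if pos else 1.0


def train_weighted_lr(X, y, penalty, epochs=2000, lr=0.5):
    """Batch gradient descent on a logistic loss whose positive examples are scaled by `penalty`."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = (penalty if yi == 1 else 1.0) * (p - yi)
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fit_per_label(X, label_columns):
    """Train one independent binary classifier per label, each with its own penalty."""
    return {name: train_weighted_lr(X, y, positive_penalty(y))
            for name, y in label_columns.items()}


def predict_proba(models, x):
    """Return {label: estimated probability} for one feature vector."""
    return {name: sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))
            for name, (w, b) in models.items()}


if __name__ == "__main__":
    # Synthetic values invented for illustration; not taken from the thesis data set.
    glucose = [4.6, 4.9, 5.1, 5.3, 5.6, 7.8, 11.2, 16.5]
    X = [[v] for v in log_minmax(glucose)]
    labels = {
        "diabetes": [0, 0, 0, 0, 0, 1, 1, 1],
        "hypertension": [0, 0, 0, 0, 0, 0, 1, 1],
    }
    models = fit_per_label(X, labels)
    print(predict_proba(models, X[-1]))
```

Training the labels independently keeps each label's penalty factor separate, which is the point of the third contribution: a label with few positive cases gets a larger penalty than a label whose classes are closer to balanced.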
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: R194.3;TP18



Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2440983.html

