Research on Associative Classification Algorithms and Their Application to Mining Massive Chronic Disease Medical Data

Published: 2018-06-17 09:11

Topics: associative classification + Hadoop; Source: Beijing University of Posts and Telecommunications, master's thesis, 2016


[Abstract]: Associative classification combines association rule mining with classification. It first mines class association rules, then builds a classifier from those rules. Compared with traditional classifiers such as decision trees and neural networks, it offers high classification accuracy and highly interpretable models, which makes it particularly suitable for medical data mining and other settings where the model must be easy to understand and apply. Chronic diseases such as hypertension and cardiovascular and cerebrovascular disease do great harm to human health, so it is worthwhile to use data mining to build classification decision models for chronic diseases that support disease prediction and assisted diagnosis. However, chronic disease data has many numerical attributes, and the attributes differ widely in importance; both characteristics degrade the results of existing associative classification techniques. To address these characteristics, this thesis proposes a fuzzy weighted associative classification algorithm based on the information gain ratio in order to improve classification accuracy. It also parallelizes and optimizes the single-node associative classification algorithm to improve scalability and meet the need for efficient processing of massive data. The work covers the design of the fuzzy weighted associative classification algorithm, the design of a chronic disease data mining scheme, the parallelization of the algorithm, and performance evaluation.
First, fuzzy sets and the information gain ratio are combined to propose the GRWFAC algorithm, which improves classifier performance. Second, for the scenario of cardiovascular disease risk prediction, a massive chronic disease data mining scheme and the model's input and output parameters are designed. Finally, the parallel associative classification algorithm MRWFAC is redesigned and implemented on the Hadoop distributed platform, and experiments on massive chronic disease data are carried out to verify the performance improvement. The thesis confirms the feasibility of the data mining scheme and the improvement in algorithm performance. Compared with C4.5 and CBA, GRWFAC achieves better accuracy and stability, and the parallel MRWFAC algorithm shows in speedup and scalability evaluations that it is suited to massive chronic disease data. The results are of positive significance for chronic disease prevention and treatment and for assisted diagnosis.
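The abstract does not give the GRWFAC weighting formula, so the following is only a minimal sketch of the standard information gain ratio (the criterion used in C4.5) for discrete attributes. The final normalization of gain ratios into attribute weights, the function names, and the toy data are all assumptions made for illustration and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain ratio of one discrete attribute with respect to the class labels."""
    n = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def attribute_weights(columns, labels):
    """Hypothetical weighting: normalize gain ratios so they sum to 1 (an assumption)."""
    ratios = {name: gain_ratio(col, labels) for name, col in columns.items()}
    total = sum(ratios.values())
    return {name: (r / total if total > 0 else 0.0) for name, r in ratios.items()}


# Toy data, invented for illustration only.
labels = ["sick", "sick", "healthy", "healthy"]
columns = {
    "blood_pressure": ["high", "high", "normal", "normal"],
    "smoker": ["yes", "no", "yes", "no"],
}
weights = attribute_weights(columns, labels)
print(weights)
```

In this toy data, blood_pressure separates the two classes perfectly and smoker carries no information about the class, so the first attribute receives all of the weight. Numerical attributes would first need to be discretized or mapped to fuzzy sets, a step the abstract mentions but does not specify.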
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification numbers]: R-05; TP311.13


Article ID: 2030510

Link to this article: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2030510.html