Research and Implementation of a Hadoop-Based Maritime Big Data Analysis and Processing Platform

Published: 2018-10-20 20:49
[Abstract]: Mining maritime data to identify the factors behind waterway safety accidents, and building models to predict such accidents, can reduce or prevent them. Maritime data is large in volume, varied in type, low in value density, and must be processed quickly. These traits distinguish it from traditional data and give it the characteristics of big data, which makes the research and implementation of a maritime big data platform highly valuable. Many data mining tools exist, such as Weka and SPSS, but they run only on a single machine and take a long time to compute when the data volume is large. Mahout provides distributed implementations of some commonly used machine learning algorithms, but whenever the data changes, the complete data set must be recomputed. The data handled by a maritime data platform grows continuously, and current data mining platforms cannot provide both distributed analysis of big data and incremental computation. To address this, the thesis studies, in light of the characteristics of maritime data, how to implement distributed data mining algorithms on Hadoop, designs an incremental computation scheme on that basis, and finally implements a Hadoop-based maritime big data analysis and processing platform. The innovation of the thesis is the incremental computation of the naive Bayes, DBSCAN, and Apriori algorithms on the Hadoop platform, together with a proposed method for detecting incremental data, so that incremental computation improves data processing efficiency. Experiments show that the platform meets the need for distributed data mining of maritime data and completes classification, clustering, and association analysis tasks efficiently and accurately. Incremental computation also effectively reduces running time without affecting the accuracy of the results.
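The abstract does not describe how the incremental computation is implemented. As an illustration of why naive Bayes lends itself to it, the sketch below assumes a categorical model whose only state is a set of counts, so a new batch can be folded in without rereading the old data. It runs on a single machine and is not Hadoop code; the class name `IncrementalNaiveBayes`, its methods, and the sample records are hypothetical and do not come from the thesis.

```python
from collections import Counter
import math


class IncrementalNaiveBayes:
    """Categorical naive Bayes whose sufficient statistics are plain counts.

    Hypothetical illustration only; not the thesis's implementation.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                # Laplace smoothing constant
        self.class_counts = Counter()     # label -> number of records
        self.feature_counts = Counter()   # (label, feature index, value) -> count
        self.feature_values = {}          # feature index -> set of values seen

    def update(self, records):
        """Fold a batch of (features, label) pairs into the existing counts."""
        for features, label in records:
            self.class_counts[label] += 1
            for i, value in enumerate(features):
                self.feature_counts[(label, i, value)] += 1
                self.feature_values.setdefault(i, set()).add(value)

    def predict(self, features):
        """Return the label with the highest smoothed log-posterior."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total)
            for i, value in enumerate(features):
                n_values = len(self.feature_values.get(i, ())) or 1
                count = self.feature_counts[(label, i, value)]
                score += math.log(
                    (count + self.alpha) / (n_label + self.alpha * n_values)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up records: (visibility, time of day) -> outcome.
old_batch = [
    (("fog", "night"), "accident"),
    (("clear", "day"), "safe"),
    (("fog", "day"), "safe"),
]
new_batch = [
    (("fog", "night"), "accident"),
    (("clear", "night"), "safe"),
]

model = IncrementalNaiveBayes()
model.update(old_batch)   # initial training
model.update(new_batch)   # later increment: only the new records are read
```

Because the counts are additive, updating with the old batch and then the new batch yields the same model as training once on their union, which is consistent with the abstract's claim that incremental computation does not affect accuracy. In a MapReduce setting the counting over the new records would presumably be distributed, but that design is an assumption here.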
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP311.13


Article ID: 2284359

Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2284359.html