A MapReduce-Parallelized Condensed Nearest Neighbor Algorithm
Published: 2018-06-08 07:11
Topics: Condensed Nearest Neighbor + K-Nearest Neighbor; Source: Journal of Chinese Computer Systems (《小型微型計(jì)算機(jī)系統(tǒng)》), 2017, No. 12
【Abstract】: Condensed Nearest Neighbors (CNN) is a sample selection algorithm proposed by Hart for K-Nearest Neighbors (K-NN), aimed at reducing the memory requirements and computational burden of the K-NN algorithm. In the worst case, however, the time complexity of CNN is O(n³), where n is the number of samples in the training set. When CNN is applied in a big-data environment, this high time complexity becomes a bottleneck. To address this problem, this paper proposes a MapReduce-based parallelization of the condensed nearest neighbor algorithm. The parallel CNN was implemented in a Hadoop environment and compared experimentally with the original CNN algorithm on six data sets. The experimental results show that the proposed algorithm is effective and solves the problem described above.
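To make the method concrete, the following is a minimal sketch of Hart's CNN condensation rule together with a partition-based parallelization in the spirit of the MapReduce approach the abstract describes. This is an illustrative assumption, not the authors' Hadoop implementation: `multiprocessing` stands in for MapReduce, and the function names (`cnn_condense`, `parallel_cnn`) are hypothetical.

```python
# Sketch of Hart's Condensed Nearest Neighbor (CNN) rule, plus a
# partition-based parallel version (map: condense each chunk;
# reduce: merge the condensed chunks and condense once more).
import numpy as np
from multiprocessing import Pool

def nearest_label(X_store, y_store, x):
    """Label of the stored sample nearest to x (1-NN)."""
    d = np.linalg.norm(X_store - x, axis=1)
    return y_store[np.argmin(d)]

def cnn_condense(X, y):
    """Hart's CNN: absorb a sample only if the current store misclassifies it.

    Repeats full passes over the data until a pass adds nothing, which
    guarantees the condensed set classifies every training sample correctly.
    """
    keep = [0]  # seed the store with the first sample
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            if nearest_label(X[keep], y[keep], X[i]) != y[i]:
                keep.append(i)  # misclassified -> add to the store
                changed = True
    return X[keep], y[keep]

def _condense_chunk(args):
    return cnn_condense(*args)

def parallel_cnn(X, y, n_parts=4):
    """Map: condense each partition independently in parallel.
    Reduce: merge the (much smaller) condensed sets and condense again."""
    Xs = np.array_split(X, n_parts)
    ys = np.array_split(y, n_parts)
    with Pool(n_parts) as pool:
        parts = pool.map(_condense_chunk, list(zip(Xs, ys)))
    X_merged = np.vstack([p[0] for p in parts])
    y_merged = np.concatenate([p[1] for p in parts])
    return cnn_condense(X_merged, y_merged)
```

The sequential loop is the O(n³) worst case the abstract refers to; the partition step shrinks each chunk before the final pass, which is what makes the MapReduce formulation attractive on large training sets.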
【Affiliations】: Key Laboratory of Machine Learning and Computational Intelligence of Hebei Province, College of Mathematics and Information Science, Hebei University; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University
【Funding】: National Natural Science Foundation of China (71371063); Natural Science Foundation of Hebei Province (F2017201026); Key Discipline of Computer Science and Technology of Zhejiang Province (Zhejiang Normal University) project
【Classification】: TP311.13
本文編號(hào):1995065
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1995065.html