Research on Parallel Construction Algorithms for Approximate Concept Lattices in the Hadoop Environment

Published: 2019-03-26 08:40

[Abstract]: With the rapid development of information technology, the global volume of data is growing explosively. Big data refers to large, complex data sets from which valuable information can be mined only with new processing models. Big data processing typically relies on distributed computing, which splits a complex problem into small parts, assigns them to multiple computers, and combines the partial results into the final result. Distributed computing can greatly reduce program running time. A concept lattice is a tool for efficiently analyzing data and acquiring knowledge; it has been applied in many fields, such as machine learning, information retrieval, and expert systems, and it displays the relationships between objects and attributes intuitively. In practice, information systems often contain missing values. A formal context that contains missing values is called an incomplete formal context, and the concept lattice model built on it is called an approximate concept lattice. On massive data, traditional serial algorithms for constructing approximate concept lattices are inefficient. To address this problem, the thesis analyzes the characteristics of approximate concept lattices and incomplete information systems in depth and proposes two parallel construction algorithms for approximate concept lattices based on the MapReduce framework in the Hadoop environment: a parallel merging algorithm and a parallel incremental algorithm. Specifically: (1) Parallel merging algorithm: within the MapReduce framework, two concept lattices are first generated and then merged. Experiments on the LD2011__2014 data set show that the parallel algorithm is feasible and efficient. (2) Parallel incremental algorithm: a parallel algorithm built on the classical incremental algorithm that generates the approximate concept lattice directly, without a merging step. Experiments on the LD2011__2014 data set show that this algorithm is also feasible and efficient.
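The split, process, and combine pattern described in the abstract can be illustrated with a small single-process Python sketch. It is not the thesis's algorithm: it assumes a classical (complete) formal context rather than an incomplete one, it uses plain `map` and `functools.reduce` in place of Hadoop, and the names `local_intents`, `merge_intents`, `extent`, and the toy data are all hypothetical. It relies on a standard fact about concept lattices: when a context is split by objects over the same attribute set, the concept intents of the whole context are the pairwise intersections of the intents of the parts.

```python
from functools import reduce
from itertools import product


def local_intents(rows, attributes):
    """Map step (illustrative): intents of one partition.

    Objects are added one at a time; each new object's attribute set is
    intersected with every intent found so far.
    """
    intents = {frozenset(attributes)}  # intent of the empty object set
    for attrs in rows.values():
        row = frozenset(attrs)
        intents |= {row & intent for intent in intents}
    return intents


def merge_intents(left, right):
    """Reduce step (illustrative): merge two partitions' intents."""
    return {a & b for a, b in product(left, right)}


def extent(intent, context):
    """Objects that have every attribute in the given intent."""
    return frozenset(obj for obj, attrs in context.items() if intent <= attrs)


# Hypothetical toy context: object -> set of attributes it has.
ATTRIBUTES = {"a", "b", "c"}
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"c"},
}

# Split the objects into two partitions, as a distributed job might.
PARTITIONS = [
    {name: CONTEXT[name] for name in ("o1", "o2")},
    {name: CONTEXT[name] for name in ("o3", "o4")},
]

partials = map(lambda part: local_intents(part, ATTRIBUTES), PARTITIONS)
merged = reduce(merge_intents, partials)

# Reference result computed serially over the whole context.
serial = local_intents(CONTEXT, ATTRIBUTES)

# Each concept pairs an intent with its extent.
concepts = {intent: extent(intent, CONTEXT) for intent in merged}
```

In this sketch `merged` and `serial` contain the same intents, which is the property a merge-based parallel construction depends on. Handling missing values, as approximate concept lattices do, would require different derivation operators and is outside the scope of the example.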
[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13;TP338.6

Article ID: 2447368

Link to this article: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2447368.html
