Research on Parallel Construction Algorithms for Approximate Concept Lattices in the Hadoop Environment
[Abstract]: With the rapid development of information technology, the volume of data worldwide is growing explosively. Big data refers to data sets so large and complex that new processing models are needed to mine valuable information from them, and it is usually processed with distributed computing. Distributed computing divides a complex problem into small parts, assigns them to several computers, and combines the partial results into the final result, which can greatly reduce program running time. A concept lattice is a tool for analyzing data and acquiring knowledge efficiently. It has been applied in many fields, such as machine learning, information retrieval, and expert systems, and it can visualize the relationships between objects and attributes. In practice, information systems often have missing values. A formal context that contains missing information is called an incomplete formal context, and the concept lattice model built on it is called an approximate concept lattice. On massive data, the traditional serial algorithms for constructing approximate concept lattices are inefficient. To address this problem, this thesis analyzes the characteristics of approximate concept lattices and incomplete information systems in depth and proposes two parallel construction algorithms for approximate concept lattices based on the MapReduce framework in the Hadoop environment: a parallel merging algorithm and a parallel incremental algorithm. The main contents are as follows:
(1) Parallel merging algorithm: within the MapReduce framework, two concept lattices are first generated and then merged. Experimental results show that the parallel algorithm is feasible and efficient.
(2) Parallel incremental algorithm: a parallel algorithm built on the classical incremental algorithm, which generates the approximate concept lattice directly, without merging. The LD2011__2014 data set is used as the experimental data, and the results show that the algorithm is feasible and efficient.
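The split, compute, and merge idea behind the parallel merging algorithm can be illustrated with a small single-machine sketch. It is not the thesis's algorithm: it assumes a complete formal context (no missing values), simulates the map and reduce phases with plain Python functions instead of the Hadoop API, and uses hypothetical names (`intents_of`, `merge_intents`, `extent_of`) and a made-up four-object context. It relies only on the standard fact that concept intents are the intersections of object rows, so the intents of a context split by objects are the pairwise intersections of the intents of its parts.

```python
from functools import reduce
from itertools import product


def intents_of(rows, attributes):
    """Return all concept intents of a complete context {object: attribute set}."""
    intents = {frozenset(attributes)}  # intent of the empty object set
    for attrs in rows.values():
        row = frozenset(attrs)
        # Adding one object creates the intersections of its row with known intents.
        intents |= {row & known for known in intents}
    return intents


def merge_intents(left, right):
    """Merge the intent sets of two object-wise partitions of one context."""
    return {a & b for a, b in product(left, right)}


def extent_of(intent, rows):
    """Objects that have every attribute in the intent."""
    return frozenset(obj for obj, attrs in rows.items() if intent <= attrs)


# Hypothetical toy context: four objects, three attributes.
attributes = {"a", "b", "c"}
rows = {
    "o1": {"a", "b"},
    "o2": {"b", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"c"},
}

# "Map" phase: each partition of the objects is processed independently.
parts = [
    {"o1": rows["o1"], "o2": rows["o2"]},
    {"o3": rows["o3"], "o4": rows["o4"]},
]
partial = [intents_of(part, attributes) for part in parts]

# "Reduce" phase: partial results are merged pairwise.
merged = reduce(merge_intents, partial)

# Each concept pairs an extent (objects) with its intent (attributes).
concepts = {(extent_of(intent, rows), intent) for intent in merged}
```

In this toy case the merged result equals the intents computed serially over the whole context. Handling the missing values of an incomplete formal context, as the thesis does, would require replacing the plain attribute sets with the approximate-concept operators, which this sketch does not attempt.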
[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13; TP338.6
Article No.: 2447368
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2447368.html