An Efficient Large-Scale Graph Data Processing Mechanism in the Spark Environment

Published: 2018-11-28 13:52

[Abstract]: To address the low efficiency and data-storage-structure problems of existing graph processing and graph management frameworks, this paper proposes a processing mechanism suited to large-scale graph data. It first analyzes the strengths and shortcomings of several current graph processing models and graph storage frameworks. Second, based on an analysis of the characteristics of distributed computing, it designs the graph data processing framework around three aspects: a partitioning algorithm suited to large-scale graphs, optimized data extraction, and a mechanism that combines the cache, the computing layer, and the persistence layer. Finally, experiments built on the PageRank and single-source shortest path (SSSP) algorithms compare its performance with the MapReduce framework and with a Spark framework that uses HDFS as its persistence layer. The experiments show that the proposed framework is 90 times faster than the MapReduce framework and 2 times faster than the Spark framework using HDFS as the persistence layer, and that it can meet the needs of high-efficiency graph data processing applications.
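For readers unfamiliar with the two benchmark workloads named in the abstract, the following is a minimal single-machine sketch of PageRank and SSSP in Python. It is not the proposed framework and does not reflect its partitioning, caching, or persistence design, none of which the abstract specifies. The function names, the adjacency-dictionary input format, the damping factor of 0.85, and the use of Dijkstra's algorithm for SSSP are all assumptions made for illustration.

```python
import heapq


def pagerank(edges, num_iters=20, damping=0.85):
    """Textbook iterative PageRank (illustrative, not the paper's code).

    edges: dict mapping each node to a list of its out-neighbours.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(num_iters):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src, targets in edges.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def sssp(weighted_edges, source):
    """Single-source shortest paths via Dijkstra (assumes weights >= 0).

    weighted_edges: dict mapping node -> list of (neighbour, weight).
    Returns distances for reachable nodes only.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in weighted_edges.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Both workloads are iterative and repeatedly touch the same graph data, which is the kind of access pattern where keeping data cached between iterations, rather than rereading it from disk, would be expected to matter.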
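[Author affiliation]: School of Information, Yunnan University
[Funding]: National Natural Science Foundation of China (61170222)
[Classification number]: TP311.13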

Article ID: 2363025


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2363025.html
