天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

基于BSP的大規(guī)模圖處理系統(tǒng)中通信和緩存技術(shù)研究

發(fā)布時(shí)間:2018-02-27 03:20

  本文關(guān)鍵詞: BSP 圖處理 消息通信 磁盤緩存 云計(jì)算 出處:《東北大學(xué)》2012年碩士論文 論文類型:學(xué)位論文


【摘要】:隨著計(jì)算機(jī)以及網(wǎng)絡(luò)技術(shù)的發(fā)展,在計(jì)算機(jī)集群中采用并行的分布式計(jì)算方式提高計(jì)算處理能力已經(jīng)成為發(fā)展趨勢(shì)。云計(jì)算(Cloud Computing)的一個(gè)最主要的優(yōu)勢(shì)就是它的強(qiáng)大的并行計(jì)算處理能力,而這種能力是建立在一個(gè)簡(jiǎn)便高效的并行編程模型的基礎(chǔ)上的。其中,最有代表性的就是Google提出的MapReduce分布式并行編程模型。然而,隨著近年來(lái)互聯(lián)網(wǎng)應(yīng)用的迅猛發(fā)展,Web網(wǎng)絡(luò)、社交網(wǎng)絡(luò)等大規(guī)模網(wǎng)絡(luò)圖數(shù)據(jù)的分析處理成為了研究熱點(diǎn),例如社交網(wǎng)絡(luò)中的最短路徑、網(wǎng)頁(yè)搜索的PageRank等。這些圖處理問(wèn)題通常需要多次迭代,而MapReduce適合于通用的大數(shù)據(jù)集計(jì)算問(wèn)題,在處理具有多次迭代性質(zhì)的圖挖掘問(wèn)題時(shí)會(huì)導(dǎo)致次優(yōu)的性能。因而這些圖算法往往更適合于采用基于消息傳遞的并行模型來(lái)處理。BSP (Bulk Synchronous Parallel)整體同步并行模型就是一種支持消息傳遞的塊內(nèi)異步并行,塊間顯式同步的并行計(jì)算模型。隨著Google基于BSP模型實(shí)現(xiàn)的大規(guī)模圖處理系統(tǒng)Pregel的提出,在云環(huán)境中采用BSP模型實(shí)現(xiàn)大規(guī)模圖處理系統(tǒng)成為了主要的解決途徑。 本文旨在以BSP模型為核心,研究基于BSP模型的大規(guī)模圖處理系統(tǒng)中的消息通信原理和磁盤緩存技術(shù)的設(shè)計(jì)方案及其實(shí)現(xiàn)等問(wèn)題。提出了一種基于隊(duì)列的消息組織方式和通信方案,并在此基礎(chǔ)上提出了基于消息打包、多發(fā)送者線程池以及支持消息合并的優(yōu)化通信方案。針對(duì)基于BSP的大圖處理系統(tǒng)可能存在的內(nèi)存不足以存放計(jì)算中所有的圖和消息數(shù)據(jù)的問(wèn)題,本文建立了數(shù)據(jù)的內(nèi)存管理模型,并基于內(nèi)存優(yōu)先(Memory First)的思想,分別提出了圖數(shù)據(jù)和消息數(shù)據(jù)的磁盤緩存策略及相應(yīng)的算法:MF-GHIC算法、MLF圖數(shù)據(jù)遍歷算法和基于消息隊(duì)列優(yōu)先級(jí)的消息數(shù)據(jù)磁盤緩存算法等。 將本文提出的通信和緩存技術(shù)應(yīng)用于NEU-BSP系統(tǒng)中,我們通過(guò)實(shí)驗(yàn),首先分析了通信方案中各類參數(shù)的較優(yōu)值及其相互的制約關(guān)系;其次證明了在磁盤緩存率低于30%時(shí),系統(tǒng)的時(shí)間性能下降并不顯著;最后,我們以PageRank和單源最短路徑為例,通過(guò)與Hadoop系統(tǒng)的對(duì)比實(shí)驗(yàn),證明了在數(shù)據(jù)完全駐留內(nèi)存時(shí),NEU-BSP系統(tǒng)比Hadoop系統(tǒng)快1.2到18倍,在數(shù)據(jù)超過(guò)30%以上緩存到磁盤時(shí),NEU-BSP系統(tǒng)仍然能保持與Hadoop系統(tǒng)基本持平的時(shí)間性能。
[Abstract]:With the development of computer and network technology, the computer cluster used in parallel distributed computing capacity increase has become the development trend of computing mode. Cloud computing (Cloud Computing) is one of the main advantages is that its powerful parallel computing ability, and this ability is based on a simple and efficient parallel programming model on the basis of MapReduce. Among them, the most representative is the distributed parallel programming model proposed by Google. However, in recent years with the rapid development of Internet application, Web network, social network analysis and other large-scale network data has become a hot research topic, such as social networks in the shortest path, web search PageRank these problems. Graph processing usually requires multiple iterations, while the MapReduce big data set is suitable for general computation, with multiple iterations in processing The problem of qualitative graph mining will lead to suboptimal performance. These tend to be more suitable for the graph algorithm based on parallel model to deal with the.BSP message (Bulk Synchronous Parallel) bulk synchronous parallel model is in an asynchronous message passing parallel support, inter block explicit synchronization parallel computing model with Google. BSP model to realize large-scale graph processing system is proposed based on the Pregel, in a cloud environment using BSP model to realize large-scale graph processing system has become the main way.
The purpose of this paper is to use BSP model as the core, the research of BSP model in the large-scale system message communication principle and disk caching graph processing design and Realization Based on the presented. A news organization and communication scheme based on the queue, and on this basis put forward based on message package, optimized communication scheme for multi sender the thread pool and support message merge. In picture processing system of BSP are not enough memory to store all the maps and message data problems in calculation based on memory management model is established in this paper based on the data, and memory priority (Memory First) thoughts, respectively put forward disk caching strategy map data and message the data and the corresponding algorithm: MF-GHIC algorithm, MLF graph traversal algorithm and based on message queue priority message data disk cache algorithm.
The proposed communication and cache technology in NEU-BSP system, we first analyzed by experiments, the optimal value and the relation between the various parameters of mutual communication in the solution; it is proved that in the disk cache rate is less than 30%, the time performance of the system is not significantly decreased; at last, we use PageRank and single source shortest path as an example, through the contrast experiment with Hadoop system, proved that the complete data memory, NEU-BSP System 1.2 to 18 times faster than the Hadoop system, the data of more than 30% of the cache to disk, NEU-BSP system can still maintain the performance of time with basically the same as the Hadoop system.

【學(xué)位授予單位】:東北大學(xué)
【學(xué)位級(jí)別】:碩士
【學(xué)位授予年份】:2012
【分類號(hào)】:TP338.6;TP333

【參考文獻(xiàn)】

相關(guān)期刊論文 前3條

1 宋青;汪小帆;;最短路徑算法加速技術(shù)研究綜述[J];電子科技大學(xué)學(xué)報(bào);2012年02期

2 李稚楹;楊武;謝治軍;;PageRank算法研究綜述[J];計(jì)算機(jī)科學(xué);2011年S1期

3 于戈;谷峪;鮑玉斌;王志剛;;云計(jì)算環(huán)境下的大規(guī)模圖數(shù)據(jù)處理技術(shù)[J];計(jì)算機(jī)學(xué)報(bào);2011年10期

,

本文編號(hào):1540905

資料下載
論文發(fā)表

本文鏈接:http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1540905.html


Copyright(c)文論論文網(wǎng)All Rights Reserved | 網(wǎng)站地圖 |

版權(quán)申明:資料由用戶51076***提供,本站僅收錄摘要或目錄,作者需要?jiǎng)h除請(qǐng)E-mail郵箱bigeng88@qq.com
邻居人妻人公侵犯人妻视频| 欧美一本在线免费观看| 国产av熟女一区二区三区蜜桃| 日本不卡一区视频欧美| 日本道播放一区二区三区| 又黄又爽禁片视频在线观看| 国产色第一区不卡高清| 视频一区二区黄色线观看| 久久机热频这里只精品| 欧美日韩国产精品第五页| 成人国产激情在线视频| 真实偷拍一区二区免费视频| 熟女少妇久久一区二区三区| 亚洲欧美日韩综合在线成成| 91亚洲精品综合久久| 午夜精品麻豆视频91| 日本在线不卡高清欧美| 国产二级一级内射视频播放| 国产欧美亚洲精品自拍| 老富婆找帅哥按摩抠逼视频| 亚洲国产四季欧美一区| 欧美精品二区中文乱码字幕高清| 亚洲中文字幕在线观看黑人| 欧美欧美欧美欧美一区| 免费一级欧美大片免费看| 国产福利一区二区久久| 亚洲丁香婷婷久久一区| 尹人大香蕉一级片免费看| 中文字幕久久精品亚洲乱码| 日本精品理论在线观看| 欧美一级黄片欧美精品| 欧美又黑又粗大又硬又爽| 精品亚洲一区二区三区w竹菊| 美国黑人一级黄色大片| 污污黄黄的成年亚洲毛片| 亚洲专区一区中文字幕| 91香蕉国产观看免费人人| 亚洲综合伊人五月天中文| 欧美日韩乱码一区二区三区| 一区二区三区日韩中文| 亚洲国产av在线视频|