Application Characteristic Analysis and System Performance Optimization for Hadoop

Published: 2018-09-16 21:48
[Abstract]: Hadoop is currently the most widely used big data processing system. Although Hadoop provides an efficient solution for large-scale distributed data processing, it still faces a series of challenges: 1) the abstract programming interface that Hadoop exposes hides the underlying implementation details, which makes performance analysis of applications difficult; 2) Hadoop configuration parameters have a significant impact on system performance, but the default configuration cannot guarantee the best performance for every application, so the parameters must be tuned for each application; 3) frequent data movement severely limits the performance of big data systems, and new solutions are needed to reduce its adverse effect. This thesis studies the performance characterization and performance optimization of applications in Hadoop. First, based on dynamic binary bytecode tracing, it designs and implements a lightweight, non-intrusive, distributed performance analysis framework for Hadoop applications. The framework dynamically captures the runtime state of an application and analyzes its performance, helping users understand how the application behaves on Hadoop and pointing out directions for optimization. Second, it proposes a performance model for Hadoop applications under dynamic resource allocation and, on top of that model, uses a genetic algorithm to search the global high-dimensional configuration parameter space, thereby addressing the configuration tuning problem. The prediction error rate of the proposed performance model is below 6%; compared with the default configuration, the tuned configuration achieves an average performance improvement of 9.52 times and a maximum of 18.76 times. Finally, targeting the data-parallel processing characteristics of MapReduce applications in Hadoop, it proposes a near-data processing system that provides complete hardware and software interfaces, a dynamic task migration mechanism, and a runtime environment, and it implements a lightweight MapReduce framework that supports migrating Map and Reduce tasks to near-data processing units. Compared with a baseline system without near-data processing, the proposed system achieves a 4.83 times performance improvement and reduces system power consumption by 26%. Compared with the SMC system, which uses near-data processing but does not support data-parallel processing, the proposed system consumes 37% more power but achieves a 2.32 times performance improvement.
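The abstract describes searching the configuration parameter space with a genetic algorithm guided by a performance model. The following is a minimal sketch of that general idea only, not the thesis's method. The three property names are taken from Hadoop's `mapred-default.xml`, while the value grids, the `predicted_runtime` cost function, and every helper name are hypothetical and chosen purely for illustration.

```python
import random

# Candidate values per parameter. The property names exist in Hadoop's
# mapred-default.xml; the value grids are illustrative assumptions.
SPACE = {
    "mapreduce.task.io.sort.mb": [50, 100, 200, 400, 800],
    "mapreduce.job.reduces": [1, 2, 4, 8, 16, 32],
    "mapreduce.map.memory.mb": [512, 1024, 2048, 4096],
}
KEYS = list(SPACE)


def predicted_runtime(cfg):
    """Toy stand-in for a performance model (NOT the model from the thesis)."""
    sort_mb = cfg["mapreduce.task.io.sort.mb"]
    reduces = cfg["mapreduce.job.reduces"]
    map_mem = cfg["mapreduce.map.memory.mb"]
    spill_cost = 4000.0 / sort_mb                  # larger sort buffer, fewer spills
    reduce_cost = 320.0 / reduces + 2.0 * reduces  # parallelism vs. per-task overhead
    memory_cost = abs(map_mem - 2048) / 256.0      # pretend 2048 MB is the sweet spot
    return spill_cost + reduce_cost + memory_cost


def random_config(rng):
    """Draw one configuration uniformly from the grid."""
    return {k: rng.choice(SPACE[k]) for k in KEYS}


def crossover(a, b, rng):
    """Uniform crossover: each parameter comes from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}


def mutate(cfg, rng, rate=0.2):
    """Resample each parameter with probability `rate`."""
    return {k: (rng.choice(SPACE[k]) if rng.random() < rate else v)
            for k, v in cfg.items()}


def genetic_search(generations=30, pop_size=20, elite=4, seed=0):
    """Elitist genetic search that minimizes predicted_runtime."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=predicted_runtime)
        parents = population[:elite]  # the best configurations survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return min(population, key=predicted_runtime)


# Hadoop's documented defaults for these three properties.
DEFAULT = {
    "mapreduce.task.io.sort.mb": 100,
    "mapreduce.job.reduces": 1,
    "mapreduce.map.memory.mb": 1024,
}

best = genetic_search()
print(best, predicted_runtime(DEFAULT) / predicted_runtime(best))
```

Because the elite configurations are carried over unchanged, the best predicted runtime never gets worse from one generation to the next. In a real tuner the toy cost function would be replaced by a fitted performance model, and the speedup it reports would be a prediction to be confirmed by actually running the job.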
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP311.13


Article ID: 2244911


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2244911.html

