Research and Implementation of a Virtual Machine Management System for Large-Scale Graph Data Processing
Topic: large-scale graph data processing + Pregel system; Reference: Southeast University, master's thesis, 2016
[Abstract]: With the continued development of e-commerce, the mobile Internet, the Internet of Things, and related technologies, the scale, velocity, and complexity of data keep growing, marking society's entry into the era of big data. As data items become more tightly connected and their dependencies more complex, the distribution of some data increasingly takes on the characteristics of a graph. Traditional big data processing technologies, such as the MapReduce batch-processing framework, are poorly suited to graph data with complex relationships that requires many iterations. Google's Pregel system performs vertex computations in parallel, which greatly improves computing performance and offers a new approach to large-scale graph data processing. Existing research on large-scale graph data processing is all based on Pregel's ideas and partly solves the problem, but two issues remain. First, it ignores the performance degradation caused by resource contention among applications that are not isolated from one another. Second, it ignores the performance loss or resource waste that arises because an application's resource demands vary across its execution stages.
To address these issues, this master's thesis introduces virtualization into graph data processing. By analyzing in depth the execution characteristics of graph data processing, and by drawing on the strong process isolation and the flexible, elastic resource management that virtualization provides, it proposes application-oriented graph partitioning and resource allocation and scheduling mechanisms. These supply resources elastically according to an application's specific execution pattern in order to improve the overall execution efficiency of the graph data processing system.
The thesis carries out research in four areas. First, it studies mechanisms for extracting and analyzing the execution patterns of large-scale graph data processing applications. An open-source Pregel-like system is extended to extract application execution patterns, and a mapping between execution patterns and underlying resource requirements is established. This mapping provides a reliable theoretical basis for the subsequent allocation and scheduling of virtual resources and is the foundation of the thesis. Second, it studies an application-aware method for partitioning large-scale graph data. Partitioning is a prerequisite for parallel computation. In a virtualized environment, the thesis partitions graph data according to the application's execution pattern, reducing network communication and achieving load balance. Sound partitioning both supports better resource allocation and scheduling and improves application performance. Third, it studies a virtual resource allocation and scheduling mechanism oriented to application execution patterns. Based on the mapping between execution patterns and underlying resource requirements, an application-oriented mechanism is designed that allocates and schedules resources at fine granularity according to the execution pattern of the upper-layer application, improving resource utilization while guaranteeing application performance.
Finally, a virtualized environment is built by deploying OpenStack, and the work above is implemented on top of it. The large-scale graph data processing platform nutcat is designed and developed to integrate the application feature extraction module, the application-aware super-block partitioning module, and the execution-pattern-oriented virtual resource allocation and scheduling module, and it is deployed in the real environment of the Southeast University cloud computing center (SEU CLOUD). Experimental results in this environment show that the proposed application-aware partitioning method and the execution-pattern-oriented virtual resource allocation and scheduling mechanism can significantly improve application performance and virtual resource utilization. They also offer a new, application-oriented approach to resource allocation and scheduling when big data applications are combined with virtualized environments.
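The vertex-parallel model and the effect of partitioning described in the abstract can be illustrated with a minimal Python sketch. This is not code from the thesis or from nutcat: the function `pregel_max`, the toy graph, and the two vertex-to-worker placements are all hypothetical. The sketch runs Pregel-style supersteps that propagate the maximum vertex value, and it counts messages that cross worker boundaries as a rough stand-in for network communication.

```python
from collections import defaultdict


def pregel_max(edges, values, partition):
    """Propagate the maximum vertex value using Pregel-style supersteps.

    edges:     dict mapping each vertex to a list of its out-neighbours
    values:    dict mapping each vertex to its initial integer value
    partition: dict mapping each vertex to a worker id (hypothetical placement)
    Returns (final values, number of supersteps, cross-worker message count).
    """
    values = dict(values)  # work on a copy, leave the caller's dict untouched
    cross = 0

    # Superstep 0: every vertex is active and sends its value to its neighbours.
    inbox = defaultdict(list)
    for v, nbrs in edges.items():
        for n in nbrs:
            inbox[n].append(values[v])
            cross += int(partition[v] != partition[n])
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work.
    while inbox:
        outbox = defaultdict(list)
        for v, msgs in inbox.items():
            best = max(msgs)
            if best > values[v]:
                # The value improved, so tell the neighbours about it.
                values[v] = best
                for n in edges.get(v, []):
                    outbox[n].append(best)
                    cross += int(partition[v] != partition[n])
            # Otherwise the vertex "votes to halt" and sends nothing.
        inbox = outbox  # barrier: messages become visible in the next superstep
        supersteps += 1

    return values, supersteps, cross


# Toy undirected path graph a - b - c - d (assumed data, for illustration only).
edges = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
values = {"a": 3, "b": 6, "c": 2, "d": 1}

clustered = {"a": 0, "b": 0, "c": 1, "d": 1}    # neighbours mostly co-located
alternating = {"a": 0, "b": 1, "c": 0, "d": 1}  # every edge crosses workers

print(pregel_max(edges, values, clustered))
print(pregel_max(edges, values, alternating))
```

Both placements converge to the same vertex values in the same number of supersteps, but the clustered placement sends fewer cross-worker messages. That difference is the kind of communication cost a partitioner tries to reduce, alongside keeping the load on each worker balanced.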
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP311.52; TP302
[Thesis record]: Zhang Junxue (張駿雪); Research and Implementation of a Virtual Machine Management System for Large-Scale Graph Data Processing [D]; Southeast University; 2016
Article ID: 1898539
Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/1898539.html