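Research on Key Technologies of an MPI-Based Multi-Layer Fault-Tolerant High-Performance Cloud Computing Platform
Published: 2018-05-27 09:42
Topic: MPI + fault tolerance. Source: Wuhan University of Technology, master's thesis, 2013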
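[Summary]: As global informatization advances and computer application technology keeps iterating, the volume of information each industry must process keeps growing. In fields such as aerospace, ocean development, and weather forecasting in particular, data sets have reached the terabyte or even petabyte scale, so storing and processing data of this size is critical. Cloud computing platforms were introduced to address this problem. On the one hand, a cloud computing platform has two defining characteristics: it stores big data in a distributed fashion, and it treats task execution failure as a normal event. On the other hand, many cloud platforms are unsuitable for low-latency services and are inefficient on compute-intensive tasks, whereas MPI excels at compute-intensive work and offers fast communication with low message-passing latency. Implementing a cloud platform with MPI is therefore worthwhile. This study focuses on how to build and implement an MPI cloud platform that supports big data storage and provides multi-layer fault tolerance.
To address these problems, this thesis proposes and implements an MPI-based cloud platform. To support big data storage, it implements a distributed cluster built from MySQL, in which the individual MySQL nodes store different data, and adds a database middleware layer on top to tie the database nodes together. Users do not need to consider this storage architecture; using it is similar to using a single MySQL instance. In addition, because MPI itself provides no corresponding fault-tolerance mechanism, the author designs a three-layer fault-tolerance mechanism: rescheduling of failed tasks, task checkpoint/restart, and process migration. The mechanism is separated out behind its own interface so that platform developers can customize it to their needs and extend it through secondary development, while users can set the fault-tolerance level according to their actual requirements.
Testing and evaluation show that the database middleware running on the MySQL-based distributed cluster can handle users' SQL requests, supporting data lookup and basic insert, delete, and update operations, and that the platform copes well with node service failures and ultimately returns correct results to the user. The prototype system's feasibility, reliability, robustness, and efficiency all meet the design expectations.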
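The layered fault tolerance named in the summary can be illustrated with a minimal sketch. It is not the thesis's implementation: the names `FaultToleranceLevel`, `run_with_fault_tolerance`, and `summing_task`, the JSON checkpoint file, and the use of plain Python functions in place of MPI processes are all assumptions made for illustration. Level 1 reruns a failed task from scratch on the next node, level 2 resumes it from the last saved checkpoint, and level 3 (process migration) is only named, not modelled.

```python
import enum
import json
import os


class FaultToleranceLevel(enum.IntEnum):
    """Hypothetical user-selectable levels mirroring the three layers."""
    RESCHEDULE = 1   # rerun a failed task from scratch on another node
    CHECKPOINT = 2   # resume a failed task from its last saved state
    MIGRATE = 3      # move a running process off a failing node (not modelled)


def run_with_fault_tolerance(task, nodes, level, checkpoint_path):
    """Try `task` on each node in turn until one attempt succeeds."""
    last_error = None
    for node in nodes:
        # At CHECKPOINT level or above, reload the state left by a failed attempt.
        state = None
        if level >= FaultToleranceLevel.CHECKPOINT and os.path.exists(checkpoint_path):
            with open(checkpoint_path) as fh:
                state = json.load(fh)

        def save(new_state):
            # Checkpoints are only written when the chosen level asks for them.
            if level >= FaultToleranceLevel.CHECKPOINT:
                with open(checkpoint_path, "w") as fh:
                    json.dump(new_state, fh)

        try:
            return task(node, state, save)
        except RuntimeError as exc:
            last_error = exc  # treat the failure as normal and try the next node
    raise RuntimeError("task failed on every node") from last_error


def summing_task(node, state, save):
    """Sum 0..9; the hypothetical node "n0" always fails at step 5."""
    start, total = (state["next"], state["total"]) if state else (0, 0)
    for i in range(start, 10):
        if node == "n0" and i == 5:
            raise RuntimeError("simulated node failure on n0")
        total += i
        save({"next": i + 1, "total": total})
    return total, start  # `start` shows where this attempt began
```

With `RESCHEDULE`, the second node repeats all ten steps; with `CHECKPOINT`, it picks up at step 5, which is the trade-off a user would be choosing when setting the level.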
[Abstract]: With the advance of global informatization and the constant iteration of computer application technology, the amount of information each industry must process keeps growing, especially in fields such as aerospace, ocean development, and weather forecasting, and how to store and process data at this scale is very important. To solve this problem, the concept of the cloud computing platform was introduced. On the one hand, a cloud computing platform has two characteristics: it stores big data in a distributed manner, and it treats task execution failure as normal.
On the other hand, many cloud platforms are not suitable for low-latency services and are inefficient on compute-intensive tasks, whereas MPI is good at compute-intensive work, communicates quickly, and has low message-passing latency, so implementing a cloud platform with MPI is worthwhile. This study focuses on how to build and implement an MPI cloud platform that supports big data storage and multi-layer fault tolerance.
To address these problems, a cloud platform based on MPI is proposed and implemented. To support big data storage, a distributed cluster built from MySQL is implemented, and a database middleware layer is added on top so that the database nodes can be connected together.
Testing and evaluation show that the database middleware running on the MySQL-based distributed cluster can handle users' SQL requests, supporting data lookup and basic insert, delete, and update operations, and that the platform copes well with node service failures and finally returns the correct result to the user. The feasibility, reliability, robustness, and efficiency of the prototype system meet the design expectations.
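The idea of a database middleware layer that makes several database nodes behave like one can be sketched as follows. This is an illustration under stated assumptions, not the thesis's design: in-memory SQLite databases stand in for the MySQL nodes, the class name `ShardingMiddleware` and the `kv` table are hypothetical, and routing by a CRC32 hash of the key is one possible placement rule that the abstract does not specify.

```python
import sqlite3
import zlib


class ShardingMiddleware:
    """Routes each key to one of several database nodes (SQLite stand-ins)."""

    def __init__(self, shard_count):
        # Each in-memory database plays the role of one MySQL node.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for db in self.shards:
            db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def _shard_for(self, key):
        # Assumed placement rule: a deterministic hash of the key picks the node.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        db = self._shard_for(key)
        db.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))
        db.commit()

    def get(self, key):
        cursor = self._shard_for(key).execute("SELECT v FROM kv WHERE k = ?", (key,))
        row = cursor.fetchone()
        return row[0] if row else None

    def delete(self, key):
        db = self._shard_for(key)
        db.execute("DELETE FROM kv WHERE k = ?", (key,))
        db.commit()

    def count(self):
        # Queries that span all data are sent to every node and then combined.
        return sum(db.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
                   for db in self.shards)
```

A caller only sees `put`, `get`, `delete`, and `count`, so the fact that rows live on different nodes stays hidden, which is the single-database experience the abstract describes.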
[Degree-granting institution]: Wuhan University of Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333;TP302.8
Article ID: 1941532
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1941532.html