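Research on Key Technologies of an MPI-Based High-Performance Cloud Computing Platform with Multi-Layer Fault Tolerance
Published: 2018-05-27 09:42
Topics: MPI + fault tolerance; Source: Wuhan University of Technology, master's thesis, 2013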
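【Abstract】: With the advance of global informatization and the continual iteration of computer application technology, the volume of information that every industry must process keeps growing. In fields such as aerospace, ocean development, and weather forecasting in particular, data sets have reached the TB or even PB scale, so how to store and process data of this size has become critical. Cloud computing platforms were introduced to address this problem. On the one hand, a cloud computing platform has two defining characteristics: it can store big data in a distributed manner, and it treats task execution failure as a normal condition. On the other hand, many cloud platforms are unsuitable for low-latency services and are inefficient on compute-intensive tasks, whereas MPI excels at compute-intensive work and offers fast communication with low message-passing latency. Implementing a cloud platform with MPI is therefore worthwhile. This study focuses on how to design and implement an MPI cloud platform that supports big data storage and provides multi-layer fault tolerance.

To address these problems, this thesis proposes and implements an MPI-based cloud platform. To let the platform support big data storage, a distributed cluster is built from MySQL, with the individual MySQL nodes storing different data, and a database middleware layer is added on top to tie these database nodes together. Users do not need to be aware of this storage architecture; using it is similar to using a single MySQL instance. In addition, because MPI itself does not provide a corresponding fault tolerance mechanism, the author designs a three-layer fault tolerance mechanism consisting of rescheduling of failed tasks, task CheckPoint/Restart, and process migration. The mechanism is separated out behind its own interface so that platform developers can customize it to their needs and extend it further, while users can set the fault tolerance level according to their actual requirements.

Testing and evaluation show that the database middleware running on the MySQL-based distributed cluster can handle users' SQL requests, supporting data lookup as well as basic insert, delete, and update operations, and that the platform copes well with node service failures and still returns correct results to the user. The feasibility, reliability, robustness, and efficiency of the prototype system all meet the design expectations.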
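The first two recovery layers named in the abstract (rescheduling a failed task, and checkpoint/restart) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the thesis's implementation: `FaultLevel`, `run_with_fault_tolerance`, and `make_flaky_task` are hypothetical names, the checkpoint is a JSON file, a "node failure" is modelled as a `RuntimeError`, and process migration is left out because it requires runtime support that a short example cannot reproduce.

```python
import json
import os
import tempfile
from enum import IntEnum


class FaultLevel(IntEnum):
    """Hypothetical user-selectable fault tolerance levels (names are assumptions)."""
    RESCHEDULE = 1  # rerun a failed task from the beginning
    CHECKPOINT = 2  # rerun a failed task from its last saved checkpoint


def run_with_fault_tolerance(task, n_steps, level, ckpt_path, max_attempts=5):
    """Run task(step, state) -> state for n_steps, recovering from RuntimeError.

    Returns (final_state, number_of_attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        # Restart point: the last checkpoint if the level allows it, else step 0.
        step, state = 0, 0
        if level >= FaultLevel.CHECKPOINT and os.path.exists(ckpt_path):
            with open(ckpt_path) as f:
                saved = json.load(f)
            step, state = saved["step"], saved["state"]
        try:
            while step < n_steps:
                state = task(step, state)
                step += 1
                if level >= FaultLevel.CHECKPOINT:
                    # Save progress after every completed step.
                    with open(ckpt_path, "w") as f:
                        json.dump({"step": step, "state": state}, f)
            return state, attempt
        except RuntimeError:
            continue  # failure is treated as normal: schedule the task again
    raise RuntimeError(f"task failed after {max_attempts} attempts")


def make_flaky_task(fail_at):
    """Build a task that fails once at step `fail_at` and counts executed steps."""
    stats = {"failed": False, "executed": 0}

    def task(step, state):
        if step == fail_at and not stats["failed"]:
            stats["failed"] = True
            raise RuntimeError("simulated node failure")
        stats["executed"] += 1
        return state + step  # toy workload: running sum of step indices

    return task, stats


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        task, stats = make_flaky_task(fail_at=3)
        result, attempts = run_with_fault_tolerance(
            task, 5, FaultLevel.CHECKPOINT, os.path.join(workdir, "task.ckpt"))
        print(result, attempts, stats["executed"])
```

With a single failure at step 3 of 5, the checkpointed run executes 5 steps in total, while the same task under `FaultLevel.RESCHEDULE` executes 8, because the first three steps are repeated. That trade-off (checkpoint I/O versus repeated work) is one plausible reason to let users choose the level.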
[Abstract]: With the advance of global informatization and the constant iteration of computer application technology, the amount of information to be processed by each industry is growing larger and larger. In fields such as aerospace, ocean development, and weather forecasting in particular, data sets have reached the TB or even PB scale, and how to store and process data of this scale is very important. To solve this problem, the concept of the cloud computing platform is introduced. On the one hand, a cloud computing platform has two characteristics: it stores big data in a distributed manner, and it treats task execution failure as normal.
On the other hand, many cloud platforms are not suitable for low-latency services and are inefficient on compute-intensive tasks, while MPI is good at compute-intensive work, communicates quickly, and has low message-passing latency, so it is meaningful to implement a cloud platform with MPI. This study focuses on how to build and implement an MPI cloud platform that supports big data storage and multi-layer fault tolerance.
In view of the above problems, a cloud platform based on MPI is proposed and implemented. To enable the platform to support big data storage, a distributed cluster built from MySQL is implemented, in which the MySQL nodes store different data, and a database middleware layer is added on top so that the database nodes can be connected together; to the user, it behaves much like a single MySQL instance. Because MPI itself provides no corresponding fault tolerance mechanism, a three-layer mechanism is designed: rescheduling of failed tasks, task CheckPoint/Restart, and process migration. It is exposed through a separate interface so that developers can customize it and users can set the fault tolerance level they need.
Testing and evaluation show that the database middleware running on the MySQL-based distributed cluster can handle users' SQL requests, supporting data lookup and basic insert, delete, and update operations, and that the platform deals well with node service failures and still returns the correct result to the user. The feasibility, reliability, robustness, and efficiency of the prototype system meet the design expectations.
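The idea of a middleware layer that makes several database nodes look like one can be sketched as follows. Everything here is an assumption made for illustration: the abstract does not say how data is placed on nodes, so the sketch uses a simple "id modulo node count" rule; in-memory SQLite databases from Python's standard library stand in for the MySQL nodes; and `ShardingMiddleware` with its methods is a hypothetical name, not the thesis's interface.

```python
import sqlite3


class ShardingMiddleware:
    """Presents several database nodes as a single table (illustrative only)."""

    def __init__(self, n_nodes):
        # In-memory SQLite databases stand in for the MySQL nodes.
        self.nodes = [sqlite3.connect(":memory:") for _ in range(n_nodes)]
        for conn in self.nodes:
            conn.execute(
                "CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)")

    def _node_for(self, record_id):
        # Assumed placement rule: each id lives on exactly one node.
        return self.nodes[record_id % len(self.nodes)]

    def insert(self, record_id, value):
        self._node_for(record_id).execute(
            "INSERT INTO records (id, value) VALUES (?, ?)", (record_id, value))

    def update(self, record_id, value):
        self._node_for(record_id).execute(
            "UPDATE records SET value = ? WHERE id = ?", (value, record_id))

    def delete(self, record_id):
        self._node_for(record_id).execute(
            "DELETE FROM records WHERE id = ?", (record_id,))

    def get(self, record_id):
        # Point lookup: only the owning node is queried.
        row = self._node_for(record_id).execute(
            "SELECT value FROM records WHERE id = ?", (record_id,)).fetchone()
        return row[0] if row else None

    def select_all(self):
        # Scatter-gather: query every node, then merge and sort the results.
        rows = []
        for conn in self.nodes:
            rows.extend(conn.execute("SELECT id, value FROM records").fetchall())
        return sorted(rows)


if __name__ == "__main__":
    mw = ShardingMiddleware(n_nodes=3)
    for i in range(1, 7):
        mw.insert(i, f"v{i}")
    mw.update(2, "changed")
    mw.delete(4)
    print(mw.get(2), mw.get(4), mw.select_all())
```

The caller never names a node: point operations are routed to one node and full scans are merged across all of them, which is the sense in which such a layer can feel like a single database. A real middleware would also have to parse arbitrary SQL, handle cross-node joins and transactions, and survive node failures, none of which this sketch attempts.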
【Degree-granting institution】: Wuhan University of Technology
【Degree level】: Master's
【Year conferred】: 2013
【Classification number】: TP333; TP302.8
Article ID: 1941532
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1941532.html