
Research on Rollback Recovery Techniques in Distributed Systems

Published: 2018-02-27 22:11

  Keywords: distributed systems; rollback recovery; checkpoint; XMPP protocol; prototype system. Source: Chongqing University, 2012 doctoral dissertation. Document type: degree thesis


[Abstract]: Distributed systems offer low investment risk for users, good structural scalability, reuse of existing hardware and software resources, and simple construction, and they are used in an increasingly wide range of applications, including large-scale scientific computing, weather forecasting, time-sharing telephone systems, airline reservation systems, banking systems, stock trading systems, and online shopping systems. As systems grow in scale, the probability of a failure during computation grows exponentially, and a system failure can have catastrophic consequences, so distributed computing systems urgently need fault-tolerance mechanisms. Checkpointing and rollback recovery (Checkpoint and Rollback-Recovery) is an important class of software fault-tolerance techniques; it is simple to implement and use, has low resource requirements, and is well suited to distributed computing environments.
In a distributed computing environment, unpredictable communication bandwidth, limited storage space, node dynamism, and frequent disconnections mean that rollback-recovery techniques developed for stand-alone systems cannot be applied directly. The core problems of rollback-recovery research in distributed environments are, while preserving system consistency, to reduce the storage overhead of checkpoints and message logs, reduce the communication overhead introduced by the rollback-recovery mechanism, improve node autonomy, reduce the coupling between nodes caused by inter-process dependencies, and make the rollback-recovery mechanism transparent to nodes. This thesis studies these problems; its main contributions are as follows.
(1) A non-blocking coordinated checkpointing and rollback-recovery algorithm for distributed environments is proposed. In practical distributed applications, nodes are highly autonomous, and the desired fault-tolerance mechanism is a transparent service. The proposed checkpointing algorithm relies on the sending process to guarantee that no orphan messages arise and needs no information from the receiving process. Every checkpoint the algorithm takes is a globally consistent checkpoint; permanent checkpoints are taken directly, skipping the tentative-checkpoint phase, which shortens checkpoint formation time. Whether a process takes a checkpoint is independent of other processes and depends only on its send flag, which gives the algorithm high parallelism. After a node fails, the process only needs to broadcast a single synchronization message; each other process that receives it handles it independently according to the algorithm, with no additional messages from other processes, so that nodes execute rollback recovery transparently and in parallel. Performance analysis and simulation experiments confirm the low overhead of the algorithm during both failure-free operation and rollback recovery.
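The notion of a globally consistent checkpoint used in (1) can be illustrated with a small check. The sketch below is not the thesis's algorithm; it only applies the standard definition that a set of checkpoints is consistent when it contains no orphan message, that is, no message whose receipt is recorded while its send is not. The names `Message`, `find_orphans`, and `is_consistent`, and the event-index representation, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One application message, located by per-process event indices."""
    sender: str
    send_index: int   # position of the send event in the sender's history
    receiver: str
    recv_index: int   # position of the receive event in the receiver's history


def find_orphans(checkpoint, messages):
    """Return messages whose receipt is recorded but whose send is not.

    `checkpoint` maps each process name to the index of the last event
    included in that process's checkpoint.
    """
    orphans = []
    for m in messages:
        received_before_cut = m.recv_index <= checkpoint[m.receiver]
        sent_before_cut = m.send_index <= checkpoint[m.sender]
        if received_before_cut and not sent_before_cut:
            orphans.append(m)
    return orphans


def is_consistent(checkpoint, messages):
    """A set of checkpoints is globally consistent if it has no orphans."""
    return not find_orphans(checkpoint, messages)


# Hypothetical history: P1 sends at its event 3, P2 receives at its event 2.
msgs = [Message("P1", 3, "P2", 2)]
print(is_consistent({"P1": 3, "P2": 2}, msgs))  # send and receive both recorded
print(is_consistent({"P1": 2, "P2": 2}, msgs))  # receive recorded, send is not
```

Rolling back to an inconsistent set would leave P2 remembering a message that P1 never sent, which is why the sender-side guarantee described above matters.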
(2) A two-level checkpointing and rollback-recovery fault-tolerance algorithm based on dynamic grouping is proposed. In an application spanning many nodes, the frequency with which nodes exchange information varies, sometimes widely, so a mechanism is needed that adapts to the dynamic collaboration of processes in a distributed system. The proposed algorithm groups nodes dynamically according to metrics such as inter-node communication frequency, communication latency, communication bandwidth, and the number of nodes per group, so that groups are highly cohesive and loosely coupled. Within a group, communication latency is low and the number of nodes is small, which suits coordinated checkpointing, so a coordinated checkpointing algorithm is used at the group level. Groups are usually interconnected by high-latency, low-bandwidth networks and communicate with one another infrequently. The proposed system-level checkpointing algorithm takes these characteristics into account: whether a group takes a checkpoint is independent of other groups, so groups can take system-level checkpoints independently and in parallel. The sending group guarantees that no orphan messages arise between groups, so every system-level checkpoint is globally consistent and the domino effect is avoided. The algorithm thus adapts dynamically to the application's own needs and improves overall resource efficiency, and, by making the sending group responsible for preventing inter-group orphan messages, it replaces the traditional two-phase commit algorithm with a single-phase one. Experimental results show that the algorithm has a low execution time and that, relative to the traditional two-phase commit algorithm, time complexity falls from the usual O(n^2) to O(n).
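The abstract does not give the grouping procedure, so the following is only a rough sketch of how grouping by communication frequency might look. It is a simple greedy heuristic, not the thesis's method, and it uses just two of the listed metrics (pairwise frequency and group size); latency and bandwidth are left out. The function name `group_nodes` and its parameters `max_size` and `min_frequency` are hypothetical.

```python
def group_nodes(nodes, frequency, max_size, min_frequency):
    """Greedily merge nodes that communicate often, keeping groups small.

    `frequency` maps frozenset({a, b}) to messages exchanged per unit time.
    Pairs are visited from most to least talkative; two groups are merged
    only if the pair's frequency reaches `min_frequency` and the merged
    group would not exceed `max_size` nodes.
    """
    group_of = {n: {n} for n in nodes}  # node -> its current group (shared set)
    pairs = sorted(frequency.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for pair, freq in pairs:
        if freq < min_frequency:
            break  # every remaining pair is quieter still
        a, b = sorted(pair)
        ga, gb = group_of[a], group_of[b]
        if ga is gb or len(ga) + len(gb) > max_size:
            continue
        ga |= gb  # merge gb into ga
        for n in gb:
            group_of[n] = ga
    # Deduplicate the shared sets and return them in a stable order.
    unique = {id(g): g for g in group_of.values()}
    return sorted(sorted(g) for g in unique.values())


# Hypothetical traffic: A-B and C-D talk a lot, B-C rarely.
traffic = {
    frozenset({"A", "B"}): 50,
    frozenset({"C", "D"}): 40,
    frozenset({"B", "C"}): 2,
}
print(group_nodes(["A", "B", "C", "D"], traffic, max_size=3, min_frequency=10))
```

With these inputs the chatty pairs end up in the same group and the rarely used B-C link becomes an inter-group link, which is the high-cohesion, low-coupling outcome the paragraph describes.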
(3) A general-purpose message-passing mechanism is built on the XMPP protocol. Existing checkpointing and rollback-recovery algorithms are all custom designs whose message-passing methods differ from one another and offer no generality. Based on the characteristics of distributed systems and of the messages that checkpointing algorithms exchange, we build a general-purpose message-passing mechanism on XMPP that delivers messages across platforms in near real time. Extending the XML tags of the XMPP protocol unifies the transmission formats of the various checkpoint messages and improves program reusability.
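As a rough illustration of carrying checkpoint messages as an XML extension inside an XMPP stanza, the sketch below builds and parses a `<message/>` element with a namespaced child using only Python's standard library. The namespace `urn:example:checkpoint`, the `checkpoint` element, and its `kind` and `seq` attributes are invented for this example; the abstract does not specify the thesis's actual tag format, and a real deployment would send the stanza through an XMPP library rather than as a bare string.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the checkpoint extension; not from the thesis.
CKPT_NS = "urn:example:checkpoint"


def build_checkpoint_stanza(sender, receiver, kind, sequence):
    """Wrap a checkpoint control message in an XMPP-style <message/> stanza."""
    message = ET.Element(
        "message", {"from": sender, "to": receiver, "type": "normal"}
    )
    # Writing xmlns as a plain attribute yields a prefix-free child element,
    # the style commonly seen in XMPP extension payloads.
    payload = ET.SubElement(message, "checkpoint", {"xmlns": CKPT_NS})
    payload.set("kind", kind)           # e.g. "sync" (assumed value)
    payload.set("seq", str(sequence))   # checkpoint sequence number (assumed)
    return ET.tostring(message, encoding="unicode")


def parse_checkpoint_stanza(xml_text):
    """Return (sender, kind, sequence), or None if no extension is present."""
    message = ET.fromstring(xml_text)
    payload = message.find(f"{{{CKPT_NS}}}checkpoint")
    if payload is None:
        return None  # an ordinary message without the checkpoint payload
    return message.get("from"), payload.get("kind"), int(payload.get("seq"))


stanza = build_checkpoint_stanza("p1@example.org", "p2@example.org", "sync", 3)
print(stanza)
print(parse_checkpoint_stanza(stanza))
```

Because every checkpoint message type shares one envelope and differs only in the payload attributes, a single parser can serve several algorithms, which is the kind of format unification the paragraph refers to.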
(4) A prototype system is designed and implemented. Building a prototype on top of the theoretical work, to verify that the theory can be realized, is an important step in moving from theoretical research to practical engineering. Drawing on the preceding theoretical results, the thesis studies the prototype's system construction, the requirements analysis of the client software, the client's overall framework, its functional modules, and its processing flow, and implements a prototype system that demonstrates the feasibility of the theoretical results.

[Degree-granting institution]: Chongqing University
[Degree level]: Doctorate
[Year conferred]: 2012
[Classification number]: TP338.8





Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1544528.html

