

Research and Optimization of Alluxio-Based High-Availability Data Management Techniques

Published: 2018-04-18 16:53

  Topic: Alluxio + data management; Source: Harbin Institute of Technology, master's thesis, 2017


[Abstract]: As storage hardware costs keep falling, the big data ecosystem grows more complex, and computing frameworks and storage systems become more diverse and heterogeneous, a series of products such as memory-based distributed file systems and databases have emerged to integrate the big data ecosystem and better serve external applications. Availability is one of the key metrics for evaluating a mass storage system. With the goal of improving that availability, this thesis studies Alluxio, an open-source, memory-based virtual distributed storage system, and focuses on availability optimizations for its data management mechanisms, so as to improve the availability, in remote environments, of a mass storage system that combines Alluxio with underlying storage.

The thesis takes the availability state of such a combined system as its research subject. Drawing on availability techniques from other distributed file systems and memory-based database systems, it analyzes two problems: in remote environments, data becomes unavailable when unpredictable factors such as the network make the underlying data inaccessible; and under asynchronous storage, data becomes unavailable because of the asynchronous mechanism itself. Two optimizations are proposed in response. The first is cache prefetching and replacement: data that will be needed is fetched into Alluxio ahead of time and the amount of hot data held in Alluxio is increased, which eases data-transfer pressure during network congestion, reduces the number of accesses to underlying storage, and extends the time the system can keep serving requests when the underlying data is inaccessible. The second is an optimized asynchronous storage process that incorporates operations: when an operation is well defined and idempotent, and the underlying layer has the computing resources to execute it, Alluxio sends the command rather than the data to underlying storage, which relieves the network pressure of transferring large volumes of data; asynchronous and synchronous persistence are also combined to further guarantee the availability of persisted data.

Based on these ideas, the thesis proposes a data prefetching and replacement strategy based on association rules between data blocks, and an asynchronous storage optimization strategy that incorporates operations. Together they largely resolve the problems described above. Finally, the optimizations are evaluated experimentally. The results indicate that the association-rule-based prefetching and replacement strategy can prefetch data in remote scenarios and so avoid service unavailability caused by network problems. Because hot data is kept in Alluxio for longer, the strategy also lowers data access latency for applications, reduces the number of accesses to underlying storage, eases communication pressure under high network load, and lowers the rate of whole-system outages, which improves the availability of the system's external services. The asynchronous storage strategy preserves data availability as far as possible under asynchronous writes and reduces the pressure of network data transfer while still meeting requirements such as data integrity and consistency, so it maintains both the performance the program requires and the availability of the data.
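
The abstract does not spell out how the association rules between data blocks are mined or applied, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes a rule "block a -> block b" is learned when b is accessed within a short window after a, and that a rule is kept only if it meets support and confidence thresholds. The names `mine_rules`, `blocks_to_prefetch`, `window`, `min_support`, and `min_confidence` are hypothetical and are not part of Alluxio.

```python
from collections import Counter, defaultdict


def mine_rules(trace, window=2, min_support=2, min_confidence=0.6):
    """Mine 'block a -> block b' rules from a block access trace.

    Assumption for this sketch: a rule is counted once per access of `a`
    when `b` appears among the next `window` accesses.
    """
    occurrences = Counter(trace)
    pair_counts = Counter()
    for i, a in enumerate(trace):
        # Distinct blocks that follow this access of `a` within the window.
        followers = set(trace[i + 1 : i + 1 + window]) - {a}
        for b in followers:
            pair_counts[(a, b)] += 1

    rules = defaultdict(list)
    for (a, b), count in pair_counts.items():
        confidence = count / occurrences[a]
        if count >= min_support and confidence >= min_confidence:
            rules[a].append((b, confidence))
    # Strongest rules first; ties broken by block id for determinism.
    for a in rules:
        rules[a].sort(key=lambda item: (-item[1], item[0]))
    return dict(rules)


def blocks_to_prefetch(rules, accessed_block, cached):
    """Return blocks predicted to follow `accessed_block` that are not cached."""
    return [b for b, _ in rules.get(accessed_block, []) if b not in cached]


# Hypothetical access trace of block ids.
trace = ["A", "B", "C", "A", "B", "D", "A", "B", "C"]
rules = mine_rules(trace, window=1)
print(blocks_to_prefetch(rules, "A", cached={"A"}))  # prints ['B']
```

A replacement policy could consult the same rules in the other direction, for example by deferring eviction of blocks that a recently accessed block predicts; that part is left out of the sketch.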
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP333


Article ID: 1769257


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1769257.html

