Research on Caching Techniques for Parallel File Systems
Published: 2018-09-10 10:11
[Abstract]: The rapid growth of Internet technology has brought an ever-expanding volume of data and steadily rising demands on storage. Many applications now hold data at the PB scale, which poses a major challenge to storage system capacity, and storing such data quickly and effectively has become a focus of current storage research. Storage systems are commonly judged on four criteria: performance, availability, scalability, and security. A single, simple storage file system can no longer meet today's needs. Parallel file systems, with their high scalability, performance, availability, and security, have become a widely adopted approach to data storage management. Caching affects the speed at which a file system stores data, and a well-designed cache can substantially improve system performance.
Against this background, this thesis studies caching for parallel file systems and designs an intermediate caching architecture (InterMediate Caching architecture, IMCa) for the parallel file system GlusterFS, built on the in-memory cache Memcached. The thesis first introduces three storage technologies and describes their characteristics and differences, then briefly surveys several classic file systems. It next analyzes the GlusterFS system architecture and the working principles of its client and server, and examines how GlusterFS implements its features through translators (Translator). On the basis of Memcached, it then designs the IMCa cache system, which consists of three parts: the GlusterFS client, the Memcached cache layer, and the GlusterFS server. The design takes into account cache replacement, hot data, and how the cache layer connects to the GlusterFS client.
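The abstract gives no implementation details, so the following is only a minimal sketch of how a cache tier placed between a file system client and its server might serve reads. It assumes a read-through policy with least-recently-used replacement; the thesis does not state which replacement policy IMCa uses. `LRUCache`, `CachedClient`, and the dictionary standing in for the file server are hypothetical names introduced for illustration, and the cache is an in-process stand-in, not Memcached itself.

```python
from collections import OrderedDict


class LRUCache:
    """In-process stand-in for the cache layer (hypothetical; not Memcached)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item


class CachedClient:
    """Client-side read path: try the cache first, then the file server."""

    def __init__(self, cache, backend):
        self.cache = cache
        self.backend = backend  # assumed mapping of path -> bytes
        self.backend_reads = 0  # counts reads that reached the server

    def read(self, path):
        data = self.cache.get(path)
        if data is None:
            # Cache miss: fetch from the server and populate the cache.
            data = self.backend[path]
            self.backend_reads += 1
            self.cache.set(path, data)
        return data


if __name__ == "__main__":
    server = {"/vol/a": b"A", "/vol/b": b"B"}
    client = CachedClient(LRUCache(capacity=2), server)
    client.read("/vol/a")
    client.read("/vol/a")
    print(client.backend_reads)
```

In this sketch a repeated read of the same path reaches the server only once, which is the effect an intermediate cache tier is meant to provide.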
The Memcached cache layer of IMCa is itself divided into three layers: a network interface layer, a system control layer, and a data storage layer. The network interface layer establishes and manages client connections; it parses and processes commands and handles concurrent connections. The system control layer is the core of the cache layer and provides load balancing, data management, and replica management. Data management controls data operations such as adding, querying, and updating data. Replica management ensures that hot data is copied to other standby cache nodes so that requests for hot data can be served with consistent reads. The data storage layer stores the actual data and assigns each item a validity period. Finally, a system was built according to the designed framework and its performance was tested; the experimental results indicate that the cache system helps improve the performance of the GlusterFS parallel file system. 32 figures, 56 references.
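To make the division of responsibilities concrete, here is a hedged sketch of a control layer that routes keys to cache nodes, copies hot keys to a standby node, and relies on a storage layer that expires items after a validity period. The abstract does not say how keys are distributed, when data counts as hot, or how long items live, so the CRC32-based routing, the hit-count threshold, and the TTL value are all assumptions. `CacheNode` and `ControlLayer` are hypothetical names, not part of IMCa, Memcached, or GlusterFS.

```python
import time
import zlib


class CacheNode:
    """One node of the data storage layer; items expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expiry time)

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # lazily drop the expired item
            return None
        return value


class ControlLayer:
    """Routes keys to nodes and replicates hot keys to a standby node."""

    def __init__(self, nodes, standby, hot_threshold=3, ttl=60.0):
        self.nodes = nodes
        self.standby = standby
        self.hot_threshold = hot_threshold  # assumed hit count marking a key hot
        self.ttl = ttl  # assumed validity period in seconds
        self._hits = {}

    def _primary(self, key):
        # crc32 is stable across runs, unlike the built-in hash() for str.
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def set(self, key, value):
        self._primary(key).set(key, value, self.ttl)

    def get(self, key):
        value = self._primary(key).get(key)
        if value is None:
            return None
        self._hits[key] = self._hits.get(key, 0) + 1
        if self._hits[key] == self.hot_threshold:
            # Copy hot data to the standby node so it can also serve reads.
            self.standby.set(key, value, self.ttl)
        return value
```

Passing the clock in as a parameter is a design choice made for this sketch: it lets expiry be exercised with a fake clock instead of real waiting.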
【Degree-granting institution】: Beijing Jiaotong University
【Degree level】: Master's
【Year degree conferred】: 2013
【Classification number】: TP333; TP316.4
Article ID: 2234151
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2234151.html