A Random Data Access Method for HDFS
Published: 2019-05-22 04:48
[Abstract]: To simplify the file system implementation and support streaming access to very large data sets, HDFS sacrifices random access to files, yet many real-world applications need it. Based on an in-depth analysis of how HDFS reads and writes data, this paper proposes a random data access method for HDFS. The design adds a local data access interface to the Datanode, so that a user program can read the block files stored on a Datanode and write data into the Datanode's block storage directory. The first replica of a file is produced directly by the user program; the remaining replicas are generated by data replication after the first replica has been written. In addition, permission management is added to data blocks: the file replicas on a Datanode are owned by the user, and when a file's permissions change in the namespace, the permissions of its corresponding data blocks change as well. Tests show that read performance improves by about 10% and write performance by more than 20%, and that under high concurrency write performance improves by up to 2.5 times.
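[Affiliations]: Institute of High Energy Physics, Chinese Academy of Sciences; University of Chinese Academy of Sciences
[Funding]: National Natural Science Foundation of China (No. 11375223, No. 11375221)
[CLC number]: TP311.13; TP333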
Article ID: 2482690
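The write path described in the abstract (the first replica written directly into a local block directory, the remaining replicas copied afterwards, and permission changes applied to every replica) can be illustrated with a minimal local sketch. This is not the authors' implementation and does not use any HDFS API: plain directories stand in for Datanode block directories, and every name here (`write_at`, `read_at`, `replicate`, `set_block_mode`, `blk_0001`) is hypothetical.

```python
import os
import shutil
import tempfile


def write_at(block_path, offset, data):
    """Write bytes at an arbitrary offset of a local block file (random write)."""
    # Open for update if the block exists, otherwise create it.
    mode = "r+b" if os.path.exists(block_path) else "w+b"
    with open(block_path, mode) as f:
        f.seek(offset)
        f.write(data)


def read_at(block_path, offset, length):
    """Read `length` bytes starting at `offset` of a local block file."""
    with open(block_path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def replicate(first_replica, other_dirs):
    """Copy the finished first replica into the other (stand-in) block directories."""
    copies = []
    for d in other_dirs:
        os.makedirs(d, exist_ok=True)
        copies.append(shutil.copy2(first_replica, d))
    return copies


def set_block_mode(replicas, mode):
    """Apply one permission mode to every replica, mimicking a namespace-level change."""
    for path in replicas:
        os.chmod(path, mode)


def demo():
    # Three temporary directories play the role of three Datanodes (an assumption).
    root = tempfile.mkdtemp()
    dirs = [os.path.join(root, f"datanode{i}") for i in range(3)]
    os.makedirs(dirs[0])
    first = os.path.join(dirs[0], "blk_0001")

    write_at(first, 0, b"hello world")   # initial sequential write
    write_at(first, 6, b"HDFS!")         # in-place random write at offset 6

    replicas = [first] + replicate(first, dirs[1:])
    set_block_mode(replicas, 0o640)

    contents = [read_at(p, 0, 11) for p in replicas]
    shutil.rmtree(root)
    return contents
```

Calling `demo()` returns the first 11 bytes of each of the three replicas, which should be identical because the copies are made only after the first replica is complete.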
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2482690.html