A Random Data Access Method for HDFS
Published: 2019-05-22 04:48
[Abstract]: To simplify the file system implementation and support streaming access to very large data sets, HDFS sacrifices random access to files, yet many real-world applications need it. Based on an in-depth analysis of how HDFS reads and writes data, this paper proposes a random data access method for HDFS. The design adds a local data access interface to the Datanode, so that a user program can read the block files stored on a Datanode and write data into the Datanode's block storage directory. The first replica of a file is produced directly by the user program; the remaining replicas are created by data replication after the first replica has been written. In addition, permission management is added to data blocks: file replicas on a Datanode are owned by the user, and when a file's permissions change in the namespace, the permissions of its data blocks change accordingly. Tests show that read performance improves by about 10% and write performance by more than 20%, with write performance improving by up to 2.5 times under high concurrency.
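The idea of random access over locally stored block files can be illustrated with a small sketch. This is not the paper's implementation: the tiny block size, the `blk_<n>` file naming, and the helper names `locate`, `write_at`, and `read_at` are all assumptions made for illustration, and replication and permission handling are not modeled. It only shows how a file-level byte offset might map to a block file and an offset inside it, which is what lets a program seek and read or write in place.

```python
import os

BLOCK_SIZE = 8  # hypothetical, deliberately tiny so the example is easy to trace


def locate(offset, block_size=BLOCK_SIZE):
    """Map a file-level byte offset to (block index, offset inside that block)."""
    return divmod(offset, block_size)


def block_path(block_dir, index):
    # Hypothetical naming scheme, not the real Datanode directory layout.
    return os.path.join(block_dir, f"blk_{index}")


def write_at(block_dir, offset, data, block_size=BLOCK_SIZE):
    """Write data at an arbitrary file offset, splitting it across block files."""
    while data:
        index, inner = locate(offset, block_size)
        chunk = data[: block_size - inner]  # never cross a block boundary
        path = block_path(block_dir, index)
        mode = "r+b" if os.path.exists(path) else "w+b"
        with open(path, mode) as f:
            f.seek(inner)
            f.write(chunk)
        offset += len(chunk)
        data = data[len(chunk):]


def read_at(block_dir, offset, length, block_size=BLOCK_SIZE):
    """Read up to `length` bytes starting at an arbitrary file offset."""
    out = bytearray()
    while length > 0:
        index, inner = locate(offset, block_size)
        want = min(length, block_size - inner)
        path = block_path(block_dir, index)
        if not os.path.exists(path):
            break  # past the last block
        with open(path, "rb") as f:
            f.seek(inner)
            chunk = f.read(want)
        out += chunk
        if len(chunk) < want:
            break  # short block: end of file
        offset += want
        length -= want
    return bytes(out)
```

Under these assumptions, calling `write_at(d, 6, b"XYZW")` on a directory `d` updates the last two bytes of block 0 and the first two bytes of block 1 in place, without rewriting either block file.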
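[Author affiliations]: Institute of High Energy Physics, Chinese Academy of Sciences; University of Chinese Academy of Sciences
[Funding]: National Natural Science Foundation of China (No. 11375223, No. 11375221)
[Classification number]: TP311.13; TP333
Article ID: 2482690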
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2482690.html