Research on Ordered Storage and Prefetching Techniques for Massive Global-Subdivision-Encoded Tile Files
Published: 2018-08-28 14:24
[Abstract]: Geographic information services are characterized by large data volumes, large numbers of files, and concurrent access by many users. Traditional file systems, and distributed file systems typified by the Hadoop Distributed File System (HDFS), cannot meet the storage and access requirements of massive geospatial data. To store and serve small files at the scale of hundreds of billions, the author's project team implemented SMDFS, a distributed file system with support for massive numbers of small files, on top of HDFS. Map tile data in surveying and mapping information systems is usually organized as a pyramid, and access to it exhibits spatial locality, so file prefetching can effectively improve file access performance. However, SMDFS aggregates thousands of small files into a single aggregate file for storage, which makes it difficult to return several geographically adjacent files to the user in a single I/O operation. To address the spatial locality of surveying and mapping data access and the low efficiency of accessing individual tile files, this thesis proposes a location-based technique for prefetching surrounding tiles, with the aim of reducing the number of SMDFS I/O operations and improving file system access performance. A prerequisite for prefetching is that the tile files within the pyramid are stored in order. The thesis therefore proposes a sequential storage technique for massive global-subdivision-encoded tile files based on a recursive quad-partition sorting method, which sorts the two-dimensional tile aggregate files by geographic location so that geographically adjacent tiles are also adjacent in storage. Building on this sequential storage, the thesis proposes and implements a prefetching technique based on the ordered pyramid, which successfully addresses the low read efficiency of massive tile sets and the insufficient support for concurrent access.
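The abstract does not spell out the recursive quad-partition ordering or the prefetch policy, so the following is only a minimal sketch under stated assumptions. It assumes the ordering behaves like a Z-order (Morton) key, in which each pyramid level is split into four quadrants recursively, and it models prefetching as reading a contiguous window of the sorted sequence. The names `quad_key`, `sort_tiles`, and `prefetch_window` and the `(level, x, y)` tile tuple are hypothetical and do not come from SMDFS.

```python
def quad_key(x: int, y: int, level: int) -> int:
    """Interleave the bits of x and y into a Z-order (Morton) key.

    Assumption: a tile at pyramid level `level` has 0 <= x, y < 2**level.
    Each loop step picks one of four quadrants, from coarsest to finest.
    """
    key = 0
    for bit in range(level - 1, -1, -1):
        quadrant = (((y >> bit) & 1) << 1) | ((x >> bit) & 1)
        key = (key << 2) | quadrant
    return key


def sort_tiles(tiles):
    """Order (level, x, y) tiles by level, then by quadrant key."""
    return sorted(tiles, key=lambda t: (t[0], quad_key(t[1], t[2], t[0])))


def prefetch_window(ordered, index, radius):
    """Return the tiles stored within `radius` slots of ordered[index].

    Hypothetical policy: one contiguous read stands in for one I/O.
    """
    lo = max(0, index - radius)
    hi = min(len(ordered), index + radius + 1)
    return ordered[lo:hi]


if __name__ == "__main__":
    # A 4x4 grid of tiles at level 2.
    tiles = [(2, x, y) for y in range(4) for x in range(4)]
    ordered = sort_tiles(tiles)
    # The first four stored tiles form one 2x2 block:
    # [(2, 0, 0), (2, 1, 0), (2, 0, 1), (2, 1, 1)]
    print(prefetch_window(ordered, 0, 3))
```

With this kind of key, the four tiles of any aligned 2x2 block are contiguous in storage, so a single sequential read can return them together. Neighbors that fall across a quadrant boundary can still be far apart in the sequence, which is a known limitation of Z-order and may differ from the scheme used in the thesis.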
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP333
Article ID: 2209651
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2209651.html