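Design and Implementation of a Large-Scale Image Storage and Indexing System
Published: 2018-04-19 04:10
Topics: image storage + key-value pairs; Source: Huazhong University of Science and Technology, 2013 master's thesis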
[Abstract]: With the spread of digital devices, household images come in many types and their total volume is growing explosively, beyond what ordinary users can manage. This creates a need for an efficient storage and retrieval system for large-scale image files. To meet this need, a prototype system for large-scale image storage management and retrieval is designed and implemented.
The system uses a client/server (C/S) architecture, supports data upload and semantic extension, and applies efficient retrieval mechanisms and optimization techniques. Specifically, data upload uses the efficient and reliable File Transfer Protocol (FTP) to transfer users' image files to the server for storage. Semantic extension of the images is performed on the client, where the semantics are defined and saved as extended attributes. In server memory, a key-value database based on a layered index structure is implemented. For a key-value insertion, the key is first added to the retrieval set of the first-layer Bloom filter, then hashed to obtain the address of a second-layer balanced binary search tree (AVL tree), and finally inserted into that AVL tree. For a query, the query condition is first filtered by the first-layer Bloom filter, then hashed to obtain the address of the second-layer AVL tree, and finally looked up in that AVL tree. The create, read, update, and delete interface of the in-memory key-value database is exposed to clients through remote calls. Finally, append-only writes to a log file are combined with snapshots to synchronize the in-memory index to disk, which ensures the reliability of the in-memory index information.
Experimental results show that the in-memory layered key-value index can write about 48,600 key-value pairs per second and read about 377,800 key-value pairs per second. For a directory containing 140,000 files, a query with the find command that ships with Linux takes about 0.5 seconds on average. Assuming 10 attributes per file, building the in-memory index for 1,400,000 key-value pairs takes 30.78 seconds; after that, a query through the in-memory index takes about 30 microseconds, improving query performance by three orders of magnitude.
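The two-layer index described in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis's implementation: the names `BloomFilter` and `LayeredIndex`, the bit-array size, the number of hash functions, the number of AVL trees, the use of BLAKE2b and CRC32, and the sample key are all hypothetical. Deletion, the remote-call interface, and the log-plus-snapshot persistence are omitted.

```python
import hashlib
import zlib


class BloomFilter:
    """First layer: reports 'definitely absent' or 'possibly present'."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits, self.num_hashes = num_bits, num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # One bit position per salted BLAKE2b digest of the key.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(key.encode(), salt=bytes([i])).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None
        self.height = 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))
    return node


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    return update(x)


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    return update(y)


def avl_insert(node, key, value):
    if node is None:
        return Node(key, value)
    if key == node.key:
        node.value = value  # existing key: overwrite the value
        return node
    if key < node.key:
        node.left = avl_insert(node.left, key, value)
    else:
        node.right = avl_insert(node.right, key, value)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)  # left-right case
        return rotate_right(node)
    if balance < -1:  # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def avl_search(node, key):
    while node:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


class LayeredIndex:
    """Bloom filter in front of a hash-addressed array of AVL trees."""

    def __init__(self, num_trees=64):
        self.bloom = BloomFilter()
        self.trees = [None] * num_trees

    def _slot(self, key):
        # Hash the key to pick the second-layer AVL tree.
        return zlib.crc32(key.encode()) % len(self.trees)

    def insert(self, key, value):
        self.bloom.add(key)
        slot = self._slot(key)
        self.trees[slot] = avl_insert(self.trees[slot], key, value)

    def query(self, key):
        if not self.bloom.might_contain(key):
            return None  # rejected by the first layer, no tree is touched
        return avl_search(self.trees[self._slot(key)], key)


index = LayeredIndex()
index.insert("IMG_0001.jpg:location", "Wuhan")  # hypothetical attribute key
print(index.query("IMG_0001.jpg:location"))  # prints "Wuhan"
```

In this sketch the Bloom filter lets queries for keys that were never inserted return without visiting any tree, while spreading keys over several AVL trees keeps each tree shallow.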
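[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2013
[Classification numbers]: TP333; TP391.3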
Article ID: 1771512
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1771512.html