Design and Implementation of a Cloud-Storage-Based Data Deduplication File System

Published: 2018-03-29 00:03

  Topic: data deduplication  Focus: cloud storage  Source: Huazhong University of Science and Technology, master's thesis, 2013


[Abstract]: As demand for online storage grows, major cloud storage providers have begun exploring paid pricing models. Better service is available only to paying users, free cloud storage space no longer meets users' needs, and the cost of cloud storage has begun to affect users' work and daily life. To address this problem, a data deduplication file system based on cloud storage is proposed.

The system is a client-side file system with incremental synchronization to cloud storage. It uses data deduplication to upload the user's local data to the cloud automatically and without redundancy. The system consists of six modules. The user interface module receives system requests passed from the FUSE kernel space and calls the relevant modules to respond. The cloud synchronization module uses the cloud storage provider's open API and works with the other modules to synchronize local and cloud data. The file management module obtains the file list from the cloud, builds file index nodes (inodes), and organizes and manages files. The file operation module handles read and write requests. The deduplication module performs deduplication at the source. It uses a content-based variable-length chunking algorithm: a fixed-length sliding window computes a fingerprint over the file data, and if the fingerprint modulo a specific integer equals a predetermined value, the window position is taken as a chunk boundary. Chunks with identical fingerprints are treated as duplicates. The deduplicated file and a metadata table recording the chunk information are then uploaded to the cloud. The garbage collection module reclaims unused tables and redundant data files when the file system is unmounted.

The deduplication compression ratio was tested using multiple versions of kernel files and virtual machine files. The results show that on large-scale document data the deduplication rate reaches up to 67%. Based on Alibaba Cloud pricing, 1 TB of user data could in theory save 4,391 yuan per year.
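The content-based variable-length chunking step described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not name the fingerprint function or any parameter values, so the polynomial rolling hash, the constants `WINDOW`, `DIVISOR`, `TARGET` and `MAX_CHUNK`, the SHA-1 chunk fingerprint, and the function names `split_chunks` and `deduplicate` are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List

# All values below are illustrative assumptions, not taken from the thesis.
WINDOW = 48          # fixed sliding-window length in bytes
DIVISOR = 4096       # the "specific integer" used for the modulo test
TARGET = 7           # the "predetermined value" the remainder must equal
MAX_CHUNK = 65536    # hard cap on chunk size (an extra safeguard, assumed)
BASE = 257           # base of the polynomial rolling hash
MOD = (1 << 61) - 1  # modulus that keeps the rolling hash bounded
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the oldest byte in the window


def split_chunks(data: bytes) -> List[bytes]:
    """Split data into variable-length chunks at content-defined boundaries."""
    chunks = []
    start = 0  # offset where the current chunk begins
    fp = 0     # rolling fingerprint of the last WINDOW bytes of this chunk
    for i in range(len(data)):
        if i - start >= WINDOW:
            # Window is full: remove the contribution of the byte leaving it.
            fp = (fp - data[i - WINDOW] * TOP) % MOD
        fp = (fp * BASE + data[i]) % MOD
        length = i - start + 1
        at_boundary = length >= WINDOW and fp % DIVISOR == TARGET
        if at_boundary or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            fp = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def deduplicate(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Store each unique chunk once; return the file's list of chunk digests."""
    recipe = []
    for chunk in split_chunks(data):
        digest = hashlib.sha1(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk  # only previously unseen chunks take space
        recipe.append(digest)
    return recipe
```

In this sketch the returned list of digests plays the role of the metadata table mentioned in the abstract: concatenating `store[d]` for each digest `d` in the list reconstructs the file, and only chunks missing from `store` would need to be uploaded.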
[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333

Article ID: 1678640

Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1678640.html