
Design and Implementation of a Cloud Storage System for Personal Electronic Data

Published: 2018-05-03 18:04

  Topics: cloud storage + Hadoop Distributed File System; source: Dalian University of Technology, master's thesis, 2012


[Abstract]: With the growth of social networks, personal electronic data is increasing explosively, and the traditional centralized storage model can no longer meet the demands of storing and using it. Based on the characteristics of personal electronic data, this thesis proposes a general storage model for the cloud environment and designs indexes that support several kinds of fast query. The system manages increasingly complex personal electronic data effectively and meets users' needs for storage capacity, data availability, and resource sharing.
Cloud computing has become a research hotspot in academia, and storage systems based on the cloud concept have begun to emerge. Cloud storage emphasizes transparency: it exposes massive storage capacity through a cluster, capacity can be extended by adding storage nodes, and redundant copies of the data provide fault tolerance. Distributed data storage also supports parallel data management, which improves data access performance.
To handle the growing mass of personal electronic data, this thesis applies distributed cloud storage technology to design and implement a cloud storage system for personal electronic data. Because personal electronic data files vary widely in size, the thesis designs a general storage model: large files such as videos are stored directly in the Hadoop Distributed File System (HDFS), while relatively small files are stored directly in HBase. The model partitions storage by data category, such as pictures, documents, and videos. For video files, the thesis designs a name index and a topic index, supporting fast retrieval in both cases. For document data, whose content is disorganized but which users need to search by content, the thesis uses MapReduce programming to build Lucene inverted indexes, implementing the construction and maintenance of a distributed index. On this basis, the distributed search tool Katta is used to search the distributed index, giving efficient access to massive amounts of data. In addition, the thesis uses the operation interfaces that HDFS and HBase provide to implement upload, download, and deletion of data on the distributed system.
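The two central ideas of the abstract, routing files to a backend by size and building an inverted index in map and reduce steps, can be illustrated with a small single-process sketch. This is not the thesis's implementation: it uses plain Python rather than HDFS, HBase, MapReduce, or Lucene, and the 64 MiB cutoff and every function name below are assumptions chosen for illustration, since the abstract gives no threshold or API details.

```python
import re
from collections import defaultdict

# Assumed cutoff: the abstract states no number, so 64 MiB is purely illustrative.
SIZE_THRESHOLD_BYTES = 64 * 1024 * 1024


def choose_backend(size_bytes, threshold=SIZE_THRESHOLD_BYTES):
    """Return "hdfs" for large files and "hbase" for relatively small ones."""
    return "hdfs" if size_bytes >= threshold else "hbase"


def map_phase(doc_id, text):
    """Map step: emit one (term, doc_id) pair per word occurrence."""
    for term in re.findall(r"\w+", text.lower()):
        yield term, doc_id


def reduce_phase(term, doc_ids):
    """Reduce step: collapse the doc ids for one term into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run map, shuffle (group by term), and reduce inside one process."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for term, emitted_id in map_phase(doc_id, text):
            grouped[term].append(emitted_id)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


if __name__ == "__main__":
    docs = {
        "a.txt": "cloud storage for personal data",
        "b.txt": "personal video storage",
    }
    index = build_inverted_index(docs)
    print(index["storage"])           # posting list for the term "storage"
    print(choose_backend(10 * 1024))  # backend chosen for a 10 KiB file
```

In a real deployment the map and reduce steps would run as distributed tasks and the posting lists would be written as index shards for a search layer to query; the sketch only shows the data flow.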

[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333



Article ID: 1839526


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1839526.html

