Design and Implementation of Distributed Data Storage Based on Hadoop
Published: 2018-03-01 19:01
Keywords: distributed; Hadoop; massive data storage; cloud computing. Source: Jilin University, 2016 master's thesis. Document type: degree thesis
[Abstract]: With the Internet developing rapidly, cloud computing applications are discussed everywhere, and this constant exposure seems to signal that the cloud era has truly arrived. Cloud products have accordingly become the most popular products of this era. Traditional Web applications store their data in a database system on the server, so they come under heavy pressure as the volume of user data grows, and server maintenance costs are also very high. Storing data directly on cloud servers can keep the data safe, provide enough storage space, and greatly reduce maintenance costs.

The main work of this thesis is to design and implement cloud storage. In a Hadoop cloud environment, a Web server program calls the Hadoop Distributed File System (HDFS) API to implement a cloud storage application. The system is a J2EE application with an MVC three-tier architecture built on the Struts2, Hibernate3, and Spring3 frameworks. Log4j configures and standardizes the system log information written to the console, XFire is used to develop the WebService that provides the corresponding services, and Java code built on Apache's mail library sends e-mail. On the page side, jQuery is the main tool for performing operations without a page refresh. Because the volume of user information in this application is small, MySQL manages the user data, which avoids looping traversal of files in HDFS.

Functionally, the result is a basic cloud storage system for data files: through a browser, users can access and manage their data files at any time and from anywhere (upload, download, delete, share), and they can also manage their basic user information. Unlike an ordinary Web system, this system has a distributed architecture, and its access and response speeds are noticeably faster than those of an ordinary Web application. In addition, HDFS automatically backs up data files and stores the copies on different slave servers in the cluster, so the failure of one server does not make the files on it inaccessible: other machines take over serving those files.
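The fault-tolerance claim at the end of the abstract can be illustrated with a small toy model. The sketch below is not taken from the thesis and does not use the Hadoop API: the class `ReplicaSim`, its round-robin placement, and the node and block counts are all assumptions made for illustration, and real HDFS uses a rack-aware placement policy. It only shows why a file stays readable as long as every block keeps at least one replica on a live node.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of block replication (hypothetical; not HDFS's real placement policy). */
class ReplicaSim {
    private final int nodeCount;
    private final int replication;
    // For each stored block, the set of node ids that hold a replica of it.
    private final List<Set<Integer>> blockReplicas = new ArrayList<>();
    private final Set<Integer> failedNodes = new HashSet<>();

    ReplicaSim(int nodeCount, int replication) {
        if (replication < 1 || replication > nodeCount) {
            throw new IllegalArgumentException("need 1 <= replication <= nodeCount");
        }
        this.nodeCount = nodeCount;
        this.replication = replication;
    }

    /** Stores a file of blockCount blocks, placing each block on `replication` distinct nodes. */
    void storeFile(int blockCount) {
        for (int b = 0; b < blockCount; b++) {
            Set<Integer> nodes = new HashSet<>();
            for (int r = 0; r < replication; r++) {
                // Simple round-robin placement, an assumption of this sketch.
                nodes.add((b + r) % nodeCount);
            }
            blockReplicas.add(nodes);
        }
    }

    /** Marks one node as failed. */
    void failNode(int node) {
        failedNodes.add(node);
    }

    /** Returns true when every block still has at least one replica on a live node. */
    boolean isReadable() {
        for (Set<Integer> nodes : blockReplicas) {
            boolean alive = false;
            for (int n : nodes) {
                if (!failedNodes.contains(n)) {
                    alive = true;
                    break;
                }
            }
            if (!alive) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ReplicaSim sim = new ReplicaSim(4, 3); // 4 nodes, 3 replicas per block
        sim.storeFile(5);
        sim.failNode(1);
        System.out.println("readable after one failure: " + sim.isReadable());
    }
}
```

With three replicas per block, losing a single node leaves at least two copies of every block, so the file remains readable. Only when all nodes holding some block fail does the model report the file as unreadable.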
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP333
Article ID: 1553165
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1553165.html