Research and Development of a Big Data Storage System Based on HBase
[Abstract]: With the arrival of the big data era, the volume of data stored in information system databases is growing explosively, and the performance demands on reads, writes, and queries keep rising. Traditional relational databases can no longer meet the requirements of big data storage and query. To explore storage and query techniques for massive data, this thesis studies and builds on HBase, a typical non-relational (NoSQL) database. HBase is an open-source implementation of Google's Bigtable; it is highly reliable, high-performance, column-oriented, scalable, and consistent, and it can support secondary indexing. With HBase, a large-scale storage cluster can be built on inexpensive PC servers to realize a big data storage system. The thesis first studies the architecture of big data storage systems and discusses the key technologies of the HBase database. HBase is then deployed on a Spark big data platform and used to store a floating population database. Because HBase natively supports queries only by primary key (row key), a secondary index is added to the floating population database, which greatly improves query speed. On this basis, the performance of the HBase-based floating population database is analyzed and evaluated with YCSB, a benchmarking tool developed by Yahoo. The test object is an HBase table built from real data provided by an enterprise, with 30 million records in total. Finally, a prototype system for managing massive floating population data is developed on the Spark big data platform and the HBase database system. The system provides data acquisition, data storage, data management, statistical analysis, and system management functions. It stores 30 million records totaling 12.6 GB and achieves efficient storage and fast querying of massive floating population data.
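The abstract does not say how the secondary index is built, so the following is only a minimal sketch of the general idea, using in-memory Python dictionaries in place of HBase tables. The class name `IndexedTable`, the `city` column, and the row keys are hypothetical and do not come from the thesis. The sketch keeps a second mapping from a column value to the row keys that hold it, so a query on that column can read the index instead of scanning every row.

```python
from collections import defaultdict


class IndexedTable:
    """Toy stand-in for a row-key table plus one secondary-index table.

    This is an illustration only: it uses plain dicts, not the HBase API.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                 # row key -> {column: value}
        self.index = defaultdict(set)  # column value -> set of row keys

    def put(self, row_key, columns):
        """Write a row and keep the index consistent with it."""
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            # Drop the stale index entry before overwriting the row.
            self.index[old[self.indexed_column]].discard(row_key)
        self.rows[row_key] = dict(columns)
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(row_key)

    def get(self, row_key):
        """Primary-key lookup: the only query the base table serves directly."""
        return self.rows.get(row_key)

    def scan_by_value(self, value):
        """Query without an index: every row has to be inspected."""
        return sorted(
            key for key, cols in self.rows.items()
            if cols.get(self.indexed_column) == value
        )

    def lookup_by_value(self, value):
        """Query with the index: one lookup returns the matching row keys."""
        return sorted(self.index.get(value, ()))


table = IndexedTable("city")
table.put("p001", {"name": "A", "city": "Xi'an"})
table.put("p002", {"name": "B", "city": "Beijing"})
table.put("p003", {"name": "C", "city": "Xi'an"})
matching_keys = table.lookup_by_value("Xi'an")
records = [table.get(key) for key in matching_keys]
```

The trade-off is that every write now updates two structures, which is the usual cost of making non-key queries fast.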
【Degree-granting institution】: Xi'an University of Technology
【Degree level】: Master's
【Year degree awarded】: 2017
【Classification number】: TP311.13; TP333