Research and Application of Distributed Storage Technology for Massive Data
[Abstract]: In recent years, with the rapid development of information technology, Internet services have kept expanding, user numbers and storage requirements have kept rising, and data volumes have grown at an unprecedented rate. Storage performance, however, often degrades as storage capacity grows. Traditional databases struggle to handle massive data and expose problems such as low concurrency, poor scalability, and low efficiency. Massive data storage has therefore become an important research topic, and parallel distributed databases based on the MPP (Massively Parallel Processing) architecture are one of the research directions. This thesis is an exploratory study of massive data storage technology. Its topic is drawn from a National Key Science and Technology Support Program project of the "11th Five-Year Plan": key technology research for a secure and trusted telecom-grade reproductive health service operation support system. The work mainly addresses the access performance problems caused by the growing data volume in that project, and provides the project with storage technology that supports high concurrency, high availability, and high scalability. The research work mainly covers the following aspects:
1. The thesis studies the theory underlying massive data storage technology, including the relational and NoSQL data models, distributed database storage, and the parallel processing mode based on the MPP architecture, and summarizes existing schemes and new technologies for massive data storage.
2. It analyzes the characteristics of massive data storage technology, compares the advantages and disadvantages of distributed massive data storage technologies used in China and abroad, and designs a distributed storage model for massive data. The implementation of the SQL parsing module, the data segmentation module, the parallel query module, and the result module is described in detail.
3. Building on the massive data storage model and the parallel query and storage techniques, it develops "DB Mapping", a storage system based on the MPP architecture, which realizes a massive data storage solution with good scalability and large-scale parallel processing.
The main contributions are as follows: a parallel data storage method based on the MPP architecture is proposed, covering the path from the client's initial request to data persistence, and the idea of Map/Reduce is integrated into it. Work is distributed across the data nodes to achieve high scalability, high availability, and high concurrency. The feasibility of the method is verified by simulation tests on a set of distributed data nodes built for that purpose.
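The flow the abstract describes (segment the data across nodes, run the query on every node in parallel, then combine the partial results in the Map/Reduce style) can be illustrated with a minimal sketch. This is not the thesis's "DB Mapping" implementation: the in-memory lists standing in for data nodes, the CRC32 hash partitioning, the `user_id` and `amount` fields, and all function names are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # assumed number of data nodes (hypothetical)


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Data segmentation: map a row key to a node index with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(rows, num_nodes: int = NUM_NODES):
    """Distribute rows across in-memory lists that stand in for data nodes."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row["user_id"], num_nodes)].append(row)
    return nodes


def local_count_and_sum(shard, min_amount):
    """'Map' step: each node filters and aggregates only its own shard."""
    matching = [r["amount"] for r in shard if r["amount"] >= min_amount]
    return len(matching), sum(matching)


def parallel_query(nodes, min_amount):
    """Send the same sub-query to every node in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(
            pool.map(lambda shard: local_count_and_sum(shard, min_amount), nodes)
        )
    # 'Reduce' step: combine the per-node partial aggregates.
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_count, total_sum


# Hypothetical data: 100 rows with amounts 0..99.
rows = [{"user_id": f"u{i}", "amount": i} for i in range(100)]
nodes = partition(rows)
result = parallel_query(nodes, min_amount=50)
```

Because count and sum can be computed per shard and then added, the merged answer does not depend on how the rows were segmented; aggregates such as an average would need each node to return both count and sum rather than a local average.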
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333; TP311.13