Design and Implementation of a Data Migration Cloud Service
Article keywords: Design and Implementation of a Data Migration Cloud Service  Source: Zhejiang University, 2017 master's thesis  Document type: degree thesis
More related articles: data migration, database log, load balancing, cloud computing, distributed
[Abstract]: With the arrival of the big data era, traditional data storage and processing methods can no longer keep up with growing demand, and more and more data needs to be migrated to the Hadoop computing platform for storage and processing. As an important research direction and technique in data science, data migration is drawing increasing attention from researchers in academia and industry. Existing data migration tools often suffer from poor single-machine performance, cumbersome installation and configuration, and no support for streaming data migration. Addressing these shortcomings and building on existing research, this thesis designs a data migration cloud service for Hadoop clusters. Its main contributions are as follows: (1) It designs and optimizes a streaming data extraction and migration technique based on database logs. The database log is parsed to extract incremental data, which is packaged directly into messages and sent to the Hadoop cluster, greatly reducing the I/O, network, and other overhead of streaming data extraction. (2) It applies the mathematical idea of factor analysis to load-state assessment in load balancing and includes response time among the load balancing metrics. Compared with traditional load balancing algorithms, this algorithm assesses each node's current load more effectively and makes fuller use of cluster resources, greatly improving the throughput of the data migration system and the computing capacity of the cluster. (3) It elevates the data migration system to a cloud service. In response to the complex configuration, poor single-machine performance, and poor fault tolerance of existing industry migration tools, the proposed data migration cloud service design improves the system's overall migration capacity and throughput, and also gives migration tasks a degree of fault recoverability.
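As a rough illustration of contribution (2), the sketch below shows how including response time in a node's load score can change which node a balancer picks. It is not the thesis's algorithm: the names `NodeMetrics`, `load_score`, and `pick_node`, the fixed weights, and the 1000 ms normalization cap are all assumptions made for this example. In the thesis the load assessment is derived through factor analysis rather than hand-picked weights.

```python
from dataclasses import dataclass


@dataclass
class NodeMetrics:
    """Hypothetical snapshot of one worker node's load indicators."""
    name: str
    cpu: float          # CPU utilization in [0, 1]
    memory: float       # memory utilization in [0, 1]
    io: float           # disk I/O utilization in [0, 1]
    response_ms: float  # recent average response time, in milliseconds


# Assumed weights for illustration only; a factor-analysis approach
# would estimate the contribution of each indicator from observed data.
WEIGHTS = {"cpu": 0.3, "memory": 0.2, "io": 0.2, "response": 0.3}

# Assumed cap used to normalize response time into [0, 1].
MAX_RESPONSE_MS = 1000.0


def load_score(m: NodeMetrics) -> float:
    """Combine the indicators into one score; higher means more loaded."""
    response = min(m.response_ms / MAX_RESPONSE_MS, 1.0)
    return (
        WEIGHTS["cpu"] * m.cpu
        + WEIGHTS["memory"] * m.memory
        + WEIGHTS["io"] * m.io
        + WEIGHTS["response"] * response
    )


def pick_node(nodes: list[NodeMetrics]) -> NodeMetrics:
    """Choose the node with the lowest composite load score."""
    return min(nodes, key=load_score)


if __name__ == "__main__":
    nodes = [
        # Low resource utilization, but currently slow to respond.
        NodeMetrics("a", cpu=0.2, memory=0.3, io=0.1, response_ms=900.0),
        # Higher utilization, but responding quickly.
        NodeMetrics("b", cpu=0.5, memory=0.4, io=0.3, response_ms=50.0),
    ]
    print(pick_node(nodes).name)
```

Judged on resource utilization alone, node `a` looks lighter. Once response time is weighted in, node `b` receives the lower score and is selected.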
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP311.13;TP393.09