Research on Cloud Computing and Its Key Technologies for Massive Data Processing

Published: 2018-06-28 10:56

  Topic: massive data processing + cloud computing; Source: Nanjing University of Science and Technology, 2013 doctoral dissertation


[Abstract]: With the rapid development of information technology, data explosion has become a prominent problem in many scientific fields. While massive data provides rich information and broadens people's horizons, it also creates difficulties in data processing and storage: different information systems contain large numbers of heterogeneous data sources; data lacks a unified, standardized method of organization; in some fields massive data exists as a large number of small files, which are hard to analyze and process effectively; and massive data must also be stored efficiently. In recent years, the steady maturation of cloud computing technology has offered a new and effective approach to massive data processing.

This dissertation takes massive data as its research object, studies cloud computing theory in depth, and, drawing on related frontier ideas, addresses several key technologies of cloud computing for massive data processing, establishing a practical method for analyzing and processing massive data. The main contents are as follows:

(1) Building on the respective characteristics of existing cloud platforms, open-source cloud platforms are integrated to process and store massive data, and a new cloud-based model for processing massive small files, C-MSFPM (Cloud computing-Massive Small Files Process Model), is established. Tailored to small-file processing, the model classifies files with an improved KNN algorithm based on MapReduce and feature-vector reduction, builds a file index mechanism, and merges files with an algorithm based on the proximity principle and weight similarity.

(2) On the basis of C-MSFPM, an improved MapReduce model based on XML and multiple values is constructed to handle the complex processing and content mapping involved in file queries. The model uses XML to tag the content, coordinates, operation mappings, and other information of the data. For complex processing of massive data and content-mapping queries, XML tagging combined with multi-value handling in the Map phase allows all information related to a piece of data to be retrieved with a single lookup, which the author reports greatly improves processing efficiency. On this basis, content-mapping query and sorting of massive small PDF files are evaluated by comparing multiple groups of experimental data; the experiments indicate that the model's algorithm is correct and its performance reliable. For cloud-based processing of in-vehicle information data, a resource pool strategy is introduced to address packet loss during massive data transmission.

(3) For cloud storage, the coordination mechanism and virtualization in cloud storage are analyzed, the concept of a storage efficiency value for virtual storage nodes is derived from virtual node performance, and the cloud storage mechanism and task scheduling are discussed. A storage task allocation mechanism based on an improved genetic algorithm and a cloud storage data allocation strategy based on improved dynamic programming are proposed. According to the author, these two algorithms substantially improve storage node utilization and system load balancing.
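The abstract does not describe the index layout or the merge rule of C-MSFPM, so the following is only a minimal sketch of the general idea behind item (1): many small files are packed into one larger container and an index records where each file lives, so a single file can still be read back directly. The names `IndexEntry`, `merge_small_files`, and `read_file` are hypothetical, and sorting by file name is a placeholder for the thesis's grouping by class, proximity, and weight similarity, which is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """Location of one small file inside the merged container (assumed layout)."""
    offset: int
    length: int


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Dict[str, IndexEntry]]:
    """Concatenate small files into one blob and record each file's position."""
    merged = bytearray()
    index: Dict[str, IndexEntry] = {}
    # Placeholder ordering: the thesis groups files by class and similarity;
    # here they are simply sorted by name.
    for name in sorted(files):
        data = files[name]
        index[name] = IndexEntry(offset=len(merged), length=len(data))
        merged.extend(data)
    return bytes(merged), index


def read_file(merged: bytes, index: Dict[str, IndexEntry], name: str) -> bytes:
    """Fetch one original file from the container using the index."""
    entry = index[name]
    return merged[entry.offset:entry.offset + entry.length]
```

In a Hadoop-style deployment the container would typically be a single large file on the distributed file system, so the name node tracks one object instead of thousands; that mapping is an assumption about the setting, not a detail given in the abstract.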
[Degree-granting institution]: Nanjing University of Science and Technology
[Degree level]: Doctoral
[Year degree granted]: 2013
[Classification number]: TP333



Article ID: 2077760


Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/2077760.html

