Research on Parallel Image Processing under the MapReduce Model
Keywords: Hadoop; MapReduce model; image parallelization. Source: Xi'an University of Science and Technology, master's thesis, 2017. Document type: degree thesis.
[Abstract]: In recent years, with the rise of cloud computing and big data, the volume of data generated across networked application domains has grown rapidly, reaching the petabyte (PB) scale and beyond. Within this data, the processing and storage of large-scale image data has become an active research topic. MapReduce is a highly reliable parallel programming framework that is commonly used for parallel computation over large data sets and copes well with the problems of complex cluster environments. The MapReduce model, one of the core components of the Hadoop platform, applies this technique to massive data sets with good results, but research on processing image files, and massive numbers of small image files in particular, is still immature. To address this problem, this thesis studies parallel image processing under the MapReduce model and proposes a new data platform infrastructure suited to processing massive numbers of small image files. The main work and results are as follows. First, the thesis surveys the background, current state, and significance of research on massive image data processing in the big data context, and introduces the Hadoop ecosystem, including its core HDFS and MapReduce framework technologies. Second, it presents a detailed design for improving Hadoop: it proposes a combined-split method that improves the efficiency of processing massive numbers of small images, and it extends the MapReduce software framework so that it supports and processes image files well. Finally, it designs and implements a parallel K-means clustering algorithm for images and a parallel Sobel edge detection algorithm for images under the MapReduce model, and extends the model with operations such as parallel image histogram extraction. Experiments verify the feasibility of the extended MapReduce model and its efficiency in processing image files. Performance metric analysis confirms the effectiveness of parallel image processing under the MapReduce model, providing a feasible solution for applications that process massive image data on the Hadoop platform.
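The abstract names a combined-split method and parallel histogram extraction but gives no implementation details. The following is a minimal single-process Python sketch of the general idea only, not the thesis's Hadoop code. It assumes that a combined split is a group of small files packed together up to a size limit, and that each image is a flat sequence of 8-bit grayscale pixel values. All names here (`combine_splits`, `map_histogram`, `reduce_histograms`, `max_split_bytes`) and the sample data are hypothetical.

```python
from collections import Counter


def combine_splits(files, max_split_bytes):
    """Greedily pack (name, pixels) pairs into splits of at most
    max_split_bytes each, so that one map task handles many small files.
    A single file larger than the limit still gets its own split."""
    splits, current, size = [], [], 0
    for name, pixels in files:
        # Close the current split if adding this file would exceed the limit.
        if current and size + len(pixels) > max_split_bytes:
            splits.append(current)
            current, size = [], 0
        current.append((name, pixels))
        size += len(pixels)
    if current:
        splits.append(current)
    return splits


def map_histogram(split):
    """Map task: build one partial gray-level histogram for a whole split."""
    partial = Counter()
    for _name, pixels in split:
        partial.update(pixels)  # iterating bytes yields integer pixel values
    return partial


def reduce_histograms(partials):
    """Reduce task: sum the partial histograms into a global histogram."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Made-up sample data: three tiny "images" as raw grayscale bytes.
images = [
    ("a.raw", bytes([0, 0, 255, 128])),
    ("b.raw", bytes([0, 128])),
    ("c.raw", bytes([255, 255, 255])),
]

splits = combine_splits(images, max_split_bytes=6)
histogram = reduce_histograms(map(map_histogram, splits))
```

In this sketch the three files are packed into two splits, so two map calls run instead of three. On a real cluster the same grouping would reduce the number of map tasks launched for many small files, which is the usual motivation for combining splits.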
[Degree-granting institution]: Xi'an University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41