Parallel Optimization and Implementation of the Loeffler Algorithm on Multi-Core Platforms
Topic: DCT transform + multi-core platform; Source: Zhengzhou University, master's thesis, 2013
[Abstract]: With the rapid development of computer and communication technology, and especially with the arrival of the big-data era, demands on computing efficiency and computing power keep rising. Multi-core processors are now widespread, and developing parallel software that fully exploits the performance they offer, gradually replacing serial software, is the direction in which the software industry is heading. Parallelizing traditional serial algorithms on multi-core platforms is therefore of great significance for improving computing efficiency.
This thesis first introduces the key technologies involved in parallelizing the fast DCT on a multi-core platform, namely multi-core architecture, multithreading, thread synchronization, and parallel computing, together with the mathematical foundations of the fast DCT. On that basis it briefly describes the compression and decompression process and discusses the overall structure of the fast DCT. Then, by analyzing the core of the transform, its loop computation, it proposes a data-decomposition approach to parallelizing the Loeffler transform, one of the fast DCT algorithms. The decomposition process is described in detail and its feasibility is analyzed.
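The data-decomposition idea can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`dct_1d`, `split_rows`, `transform_rows_parallel`) are hypothetical, a naive 1D DCT-II stands in for the Loeffler kernel, and Python threads are used only to show how rows are partitioned among workers. Because of CPython's global interpreter lock, this sketch shows the structure of the decomposition rather than the speedup reported in the thesis.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def dct_1d(x):
    """Naive orthonormal 1D DCT-II (a stand-in for a fast Loeffler kernel)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def split_rows(num_rows, num_workers):
    """Partition row indices into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(num_rows, num_workers)
    ranges, start = [], 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)
        if size:  # skip empty ranges when there are more workers than rows
            ranges.append((start, start + size))
        start += size
    return ranges


def transform_rows_parallel(block, num_workers=4):
    """Apply dct_1d to every row; each task owns a disjoint slice of rows."""
    def work(bounds):
        lo, hi = bounds
        return [dct_1d(block[r]) for r in range(lo, hi)]

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() preserves the order of the ranges, so rows stay in order.
        chunks = list(pool.map(work, split_rows(len(block), num_workers)))
    return [row for chunk in chunks for row in chunk]
```

Each row is transformed independently and written to its own output slot, so the workers share no mutable state and need no locking inside the row pass. The only synchronization point is waiting for all tasks to finish before the next pass begins.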
The analysis uses data decomposition, guided by the properties of the DCT and of the two-dimensional DCT in particular. The conventional DCT is computed with a doubly nested for loop that converts the input coefficients step by step. The fast DCT instead exploits the separability of the DCT, performing the row transform first and then the column transform. This thesis applies a further data decomposition to Loeffler, one of the typical fast DCT algorithms, and uses multi-core multithreading to parallelize it. The experimental results show that execution efficiency increases step by step from the conventional DCT to the fast DCT to the parallelized fast DCT. The parallelized fast Loeffler transform shows a clear time advantage in particular cases, and the gain is more pronounced when processing images with large amounts of data. Experiments show that at a data size of 1600×1200 efficiency improves by about 30%, and that the improvement levels off as the data size grows further. In summary, the performance gains from parallelizing the fast Loeffler transform are broadly in line with theoretical expectations.
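The contrast between the nested-loop DCT and the row-column form can be sketched as follows. This is an illustrative example under stated assumptions, not code from the thesis: the function names are hypothetical, the orthonormal DCT-II normalization is assumed, and a naive 1D DCT is used in place of the Loeffler flow graph, which is not reproduced here.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scaling factor for frequency index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def _cos(i, k, n):
    """DCT-II basis value for sample i and frequency k."""
    return math.cos(math.pi * (2 * i + 1) * k / (2 * n))


def dct_2d_direct(block):
    """Direct 2D DCT-II of an N x N block using nested loops, O(N^4)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += block[i][j] * _cos(i, u, n) * _cos(j, v, n)
            out[u][v] = _scale(u, n) * _scale(v, n) * acc
    return out


def dct_1d(x):
    """Naive 1D DCT-II; a fast kernel such as Loeffler's would go here."""
    n = len(x)
    return [_scale(k, n) * sum(x[i] * _cos(i, k, n) for i in range(n))
            for k in range(n)]


def dct_2d_separable(block):
    """Row-column 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]             # row pass
    cols = [dct_1d(list(col)) for col in zip(*rows)]  # column pass
    return [list(r) for r in zip(*cols)]              # transpose back
```

Both functions compute the same coefficients. The separable form reduces the work per block from O(N^4) to O(N^3) with the naive 1D kernel, and it exposes independent row and column transforms that can be distributed across threads.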
[Degree-granting institution]: Zhengzhou University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP338.6