Hardware Design of the HEVC Intra Prediction Unit
Published: 2018-06-01 14:58
Topics: HEVC + intra prediction; Source: Xidian University, master's thesis, 2015
[Abstract]: Video is an important component of multimedia information, and in recent years it has trended toward ultra-high resolution. The macroblock-based H.264/AVC compression standard is increasingly unable to meet the compression demands of high-definition and ultra-high-definition video. To address this, the ITU and the International Organization for Standardization (ISO) jointly proposed the HEVC video compression standard, whose compression efficiency is nearly double that of H.264/AVC. The algorithmic improvements also raise complexity, which poses a major challenge for real-time encoder design. Because FPGAs offer a data-processing speed that general-purpose processors cannot match, hardware implementation of the algorithms, intra prediction in particular, has become an active research topic. In addition, the Vivado-HLS tool from Xilinx can synthesize, implement, and verify at RTL level a hardware circuit described in software code. Compared with traditional development in hardware description languages such as Verilog/VHDL, HLS tools make it easy to iterate on a module's architecture, which greatly shortens the hardware design cycle and is gradually becoming a new approach to FPGA development.
This thesis first introduces the HEVC intra prediction algorithm. It then designs pipelined and parallel architectures suited to hardware implementation, based on the characteristics of the reference sample preprocessing module and the rough selection module in intra prediction:
(1) Pipelined design of the reference sample preprocessing module. In the HEVC standard software, the availability check, assignment, and smoothing steps of the preprocessing algorithm are executed serially, so the module has a large latency in clock cycles and low throughput. To reduce latency and raise throughput, a pipelined structure for the preprocessing module is designed. Its data throughput is four times that of the serial, buffered structure of the standard algorithm.
(2) Parallel design of the rough selection unit. Exploiting the parallelism inherent in the rough selection module, a processing architecture is designed that is built on an 8*8 block processing unit and supports different block sizes. The 8*8 block processing unit performs prediction and STAD computation in a fully parallel, 64-point fashion, and prediction and STAD computation across different blocks are pipelined, raising the module's data throughput to 1.5 Gbps.
Beyond designing these two architectures, the thesis implements both with the Vivado-HLS tool and resolves the data dependences that arose during implementation and hindered hardware parallelization. Finally, both implemented architectures are tested in RTL-level simulation. The simulation results show that the hardware architectures designed and implemented in this thesis effectively improve the efficiency of HEVC intra compression.
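The per-block cost computed by the 8*8 processing unit can be illustrated with a small software sketch. It assumes that the abstract's "STAD" denotes the usual sum of absolute transformed differences (SATD) over a Hadamard-transformed residual; the abstract does not give the thesis's actual code or normalization. The names `Block8x8`, `hadamard8`, and `satd8x8` are hypothetical, and the function returns the raw, unnormalized sum.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

// One 8x8 block of samples, stored row-major (hypothetical type name).
using Block8x8 = std::array<std::array<int32_t, 8>, 8>;

// In-place, unnormalized 8-point Walsh-Hadamard transform using
// three butterfly stages (step = 1, 2, 4).
static void hadamard8(std::array<int32_t, 8>& v) {
    for (int step = 1; step < 8; step <<= 1) {
        for (int i = 0; i < 8; i += 2 * step) {
            for (int j = i; j < i + step; ++j) {
                const int32_t a = v[j];
                const int32_t b = v[j + step];
                v[j] = a + b;
                v[j + step] = a - b;
            }
        }
    }
}

// Raw SATD of one 8x8 block: transform the residual (orig - pred)
// along rows, then along columns, and sum the absolute coefficients.
// Assumption: no normalization is applied to the result.
int64_t satd8x8(const Block8x8& orig, const Block8x8& pred) {
    Block8x8 d{};
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c)
            d[r][c] = orig[r][c] - pred[r][c];

    // Row transform.
    for (int r = 0; r < 8; ++r)
        hadamard8(d[r]);

    // Column transform: gather a column, transform it, scatter it back.
    for (int c = 0; c < 8; ++c) {
        std::array<int32_t, 8> col{};
        for (int r = 0; r < 8; ++r) col[r] = d[r][c];
        hadamard8(col);
        for (int r = 0; r < 8; ++r) d[r][c] = col[r];
    }

    int64_t sum = 0;
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c)
            sum += std::abs(d[r][c]);
    return sum;
}
```

In this sketch the 64 residuals are independent of one another and each butterfly stage only combines fixed pairs, which is the kind of structure a fully parallel 64-point unit can exploit. The loops run sequentially here; how the thesis maps them to hardware is not stated in the abstract.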
[Degree-granting institution]: Xidian University
[Degree level]: Master's
[Year degree awarded]: 2015
[Classification number]: TN919.81
Article ID: 1964589
Article link: http://sikaile.net/kejilunwen/wltx/1964589.html