Research on Attention-Guided Efficient Video Decoding and Display
Published: 2019-02-15 01:19
[Abstract]: In video communication, the destination, that is, the final receiver, is a human observer. Many traditional video coding and decoding methods do not take this into account. This thesis addresses that gap. Drawing on the characteristics of the human visual system (HVS) in line of sight and field of view, and in intensity and contrast, together with its visual selectivity, it designs an attention-guided efficient video decoding and display system. The main work is as follows.
(1) The characteristics of the HVS are studied systematically. Starting from the physiological structure of the human eye, the thesis examines the functions of the HVS and then its characteristics in line of sight and field of view, and in intensity and contrast. Based on this groundwork, the structure of an attention-guided efficient video decoding and display system is designed.
(2) In implementing the designed system, two key techniques are studied in depth. The first is frame fusion across multiple data streams: the fused image is obtained by applying binarization, erosion, and Gaussian blur, followed by a weighting formula. This removes the edge artifacts left when images from multiple streams are simply stitched together. The second is high-resolution reconstruction of the background region based on the temporal redundancy of the video sequence. Part of the information in the previous frame is used to reconstruct part of the background region in the current frame, which improves the sharpness of that region. Common methods consider only the similarity between the background region and the high-resolution region, and their reconstruction quality is poor. This thesis uses the penalty function method from optimization theory to modify the objective function of a genetic algorithm so that it also incorporates the relationship between the high-resolution regions of the two frames, and the reconstructed result therefore matches the surrounding image better.
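The frame-fusion step in item (2) can be illustrated with a small sketch. The abstract names the operations (binarization, erosion, Gaussian blur, a weighting formula) but not their parameters or the formula itself, so everything below is an assumption made for illustration: the 0.5 threshold, the 3x3 erosion window, the 3x3 binomial kernel standing in for a Gaussian, and the blend `w * high + (1 - w) * low`. The function names are hypothetical and are not taken from the thesis.

```python
# Illustrative sketch only: thresholds, kernel sizes, and the blend formula
# are assumptions, not the thesis's actual parameters.

def _neighbourhood(img, y, x):
    """Return the 3x3 neighbourhood of (y, x), replicating border pixels."""
    h, w = len(img), len(img[0])
    return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def binarize(attention, threshold=0.5):
    """Turn a soft attention map into a 0/1 region-of-interest mask."""
    return [[1.0 if v >= threshold else 0.0 for v in row] for row in attention]

def erode(mask):
    """3x3 erosion: a pixel stays 1 only if its whole neighbourhood is 1."""
    return [[min(_neighbourhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]

# 3x3 binomial kernel (sums to 16), a common small approximation of a Gaussian.
KERNEL = (1, 2, 1, 2, 4, 2, 1, 2, 1)

def gaussian_blur(img):
    """Smooth the mask so the blend weight changes gradually at the boundary."""
    return [[sum(k * v for k, v in zip(KERNEL, _neighbourhood(img, y, x))) / 16.0
             for x in range(len(img[0]))] for y in range(len(img))]

def fuse(high, low, attention, threshold=0.5):
    """Blend the high-quality stream into the low-quality one using soft weights."""
    weight = gaussian_blur(erode(binarize(attention, threshold)))
    return [[w * h + (1.0 - w) * l for w, h, l in zip(wr, hr, lr)]
            for wr, hr, lr in zip(weight, high, low)]

# Toy 7x7 frames: the high-quality stream is 200 everywhere, the low one 100.
size = 7
high = [[200.0] * size for _ in range(size)]
low = [[100.0] * size for _ in range(size)]
attention = [[1.0 if 1 <= y <= 5 and 1 <= x <= 5 else 0.0 for x in range(size)]
             for y in range(size)]

fused = fuse(high, low, attention)
print([round(v) for v in fused[3]])  # [100, 125, 175, 200, 175, 125, 100]
```

In the middle row, the values ramp from the low-quality stream to the high-quality stream instead of jumping at the region boundary, which is the kind of seam suppression the abstract attributes to this step. A hard binary mask would produce a step from 100 to 200 at the boundary.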
(3) To quantify the difference in video quality between the proposed method and the traditional method at the same bit rate, two objective video quality metrics are studied: peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), with the emphasis on the widely used SSIM. The study finds that SSIM performs poorly under blur distortion, giving scores that contradict what viewers perceive. The thesis therefore proposes an improved SSIM that incorporates human visual attention by combining image edge information with region-of-interest information. Experiments show that, when subjective scores for the test sequences are estimated after nonlinear fitting, the relative error is about 50% lower than with the traditional SSIM, which indicates that the improved SSIM is closer to human subjective perception.
(4) An attention-guided efficient video decoding and display program is written in C++ with open-source libraries such as ffmpeg, implementing the designed system. Experiments show that the system meets its design goals. Using the improved SSIM, the proposed method is compared with the traditional H.264 method. The results show that, when the bit rate is limited, the objective quality of the video obtained with the proposed method is about 15% higher than with H.264, and the subjective quality is about 30% higher.
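For readers unfamiliar with the two baseline metrics named in item (3), the sketch below computes PSNR and a simplified SSIM on flat lists of pixel values. It is not the improved algorithm proposed in the thesis, whose edge and region-of-interest weighting is not specified in the abstract. The SSIM here is a single-window (global) simplification: the commonly used form evaluates the same expression over local windows and averages the results. The constants `k1 = 0.01` and `k2 = 0.03` are the conventional defaults, and the function names are hypothetical.

```python
import math

def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def ssim_global(ref, dist, peak=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole signal (no sliding window)."""
    n = len(ref)
    mu_x = sum(ref) / n
    mu_y = sum(dist) / n
    # Unbiased sample variances and covariance.
    var_x = sum((x - mu_x) * (x - mu_x) for x in ref) / (n - 1)
    var_y = sum((y - mu_y) * (y - mu_y) for y in dist) / (n - 1)
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(ref, dist)) / (n - 1)
    # Small constants that stabilise the ratios when means or variances are near 0.
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2
    luminance = (2 * mu_x * mu_y + c1) / (mu_x * mu_x + mu_y * mu_y + c1)
    structure = (2 * cov + c2) / (var_x + var_y + c2)
    return luminance * structure

reference = [10.0, 50.0, 90.0, 130.0, 170.0, 210.0]
softened = [30.0, 50.0, 90.0, 130.0, 170.0, 190.0]  # same mean, reduced contrast

print(psnr(reference, softened))
print(ssim_global(reference, softened))
```

Both metrics compare a distorted signal against a reference. PSNR depends only on the mean squared error, while SSIM separates a mean (luminance) term from a variance and covariance (contrast and structure) term.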
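[Degree-granting institution]: University of Electronic Science and Technology of China (電子科技大學)
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TN919.8
Article ID: 2422807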
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2422807.html