Research on 3D Video Coding Methods Based on Visual Perception
Published: 2018-11-19 08:10
[Abstract]: 3D-HEVC removes the temporal, spatial, and inter-view redundancy of 3D video effectively, but it does not remove perceptual redundancy well. Perceptual coding can remove further perceptual redundancy while keeping subjective quality unchanged, which reduces coding complexity or saves bitrate. How to build visual perception models and how to apply them to 3D video coding is therefore a current research focus. Building on the 3D-HEVC coding standard, this thesis studies two core technologies of 3D video coding from the perspective of visual perception: low-complexity coding and rate-distortion optimization.
To address the high complexity of depth map coding, the thesis proposes a fast depth map coding scheme based on virtual view synthesis. The scheme adopts the Maximum Tolerated Depth Distortions (MTDD) model for 3D video systems. First, because the tolerance differs with the MTDD value, an early decision algorithm is given for each type of depth range. Next, the scheme detects whether a region is a vertical edge region that is sensitive to rendering distortion, and makes the mode decision with a search strategy chosen accordingly. Finally, the two algorithms are combined to reduce the complexity of depth video coding further. Experimental results show that the proposed algorithm reduces encoding time by 49.45% while keeping the quality of the rendered virtual views and the bitrate essentially unchanged.
To address the large amount of perceptual redundancy in stereo video, the thesis proposes a Foveated Binocular Just-Noticeable Coding Distortion (FBJNCD) model. First, subjective experiments are used to study how gradient magnitude and texture magnitude influence the stereoscopic masking effect. The model also accounts for the fact that the visual sensitivity of the human visual system (HVS) is not constant: as retinal eccentricity increases, the visual threshold of a pixel increases as well, so the model incorporates the foveal perception characteristics of the HVS. Finally, the FBJNCD model is applied in the Multi View-High Efficiency Video Coding (MV-HEVC) test platform for asymmetric coding of stereo video. Experimental results show that the proposed model saves 26.04% of the bitrate on average while maintaining the perceptual quality of the stereo video, which improves stereo video compression efficiency.
Traditional Just-Noticeable Distortion (JND) models are difficult to apply to stereo video, and they overestimate the visual threshold of foreground regions while underestimating that of background regions. The thesis therefore proposes a Stereo Just-Noticeable Distortion (SJND) model that can be applied to stereo video. First, disparity information is used to divide the traditional JND map into foreground and background regions, with a smaller threshold assigned to the foreground and a larger threshold assigned to the background. In addition, because viewers attend most to the visual center and to areas of larger disparity within the foreground, a new saliency map is built on these two rules, and regions of different saliency are assigned different quantization parameter (QP) values. Experimental results show that the proposed method saves 19.92% of the bitrate on average while keeping stereo video quality unchanged.
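The saliency-driven QP assignment described at the end of the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration only: the function names `saliency_map` and `qp_map`, the equal weights for the center and disparity terms, the linear mapping from saliency to a QP offset, and the convention that more salient blocks receive a lower QP. The only external fact used is the HEVC QP range of 0 to 51 for 8-bit video. This is not the thesis's actual model.

```python
import math


def saliency_map(disparity, w_center=0.5, w_disp=0.5):
    """Per-block saliency in [0, 1] from two cues (hypothetical weights).

    disparity: 2D list of non-negative block disparities; a larger value
    is assumed to mean the block attracts more attention.
    """
    rows, cols = len(disparity), len(disparity[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    # Guard against division by zero for a 1x1 grid or an all-zero map.
    max_dist = math.hypot(cy, cx) or 1.0
    max_disp = max(max(row) for row in disparity) or 1.0

    sal = []
    for y in range(rows):
        sal_row = []
        for x in range(cols):
            # Center cue: 1 at the grid center, 0 at the farthest corner.
            center = 1.0 - math.hypot(y - cy, x - cx) / max_dist
            # Disparity cue: normalized to [0, 1].
            disp = disparity[y][x] / max_disp
            sal_row.append(w_center * center + w_disp * disp)
        sal.append(sal_row)
    return sal


def qp_map(saliency, base_qp=32, max_offset=4):
    """Map saliency to a per-block QP (assumed linear rule).

    Saliency 1 gives base_qp - max_offset (finer quantization);
    saliency 0 gives base_qp + max_offset (coarser quantization).
    The result is clipped to the HEVC range [0, 51].
    """
    return [
        [min(51, max(0, base_qp + round(max_offset * (1.0 - 2.0 * s))))
         for s in row]
        for row in saliency
    ]


if __name__ == "__main__":
    # Toy 3x3 block grid with one high-disparity block in the center.
    disparity = [[0, 0, 0],
                 [0, 8, 0],
                 [0, 0, 0]]
    for row in qp_map(saliency_map(disparity)):
        print(row)
```

In this toy grid the central block, which is both centered and has the largest disparity, receives the lowest QP, and the corner blocks receive the highest. A real encoder integration would apply such offsets per coding unit and would need thresholds calibrated by subjective testing.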
[Degree-granting institution]: Ningbo University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN919.81
Article ID: 2341652
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2341652.html