Advanced Scalable Coding and Motion Estimation in Error-Resilient Video Compression

Published: 2019-06-19 03:29

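[Abstract]: We live in an era of highly developed information technology and encounter large amounts of multimedia content in daily life, especially images and video transmitted over networks. Demand for rich media over the Internet and wireless networks is growing rapidly. Driving these rich-media communication and entertainment services requires not only enhanced broadband access but also powerful media coding techniques that make transmission more efficient. Several video coding standards, such as the ISO/IEC MPEG series and the ITU-T video coding standards, have been developed successfully and significantly reduce the data rate. Most of these compression methods use a block-based discrete cosine transform (DCT) with motion compensation to remove spatial and temporal redundancy.

Two main problems stand out in video coding designed for network transmission. The first is that any network tries to deliver data as well as it can but cannot guarantee reliability. Video has a much larger data volume than other data types, so limited transmission bandwidth, low processor power budgets, and limited available storage can restrict its delivery. For video applications, high transmission error rates impose additional costs in delay, complexity, and quality. Retransmission is an effective remedy for transmission errors, but it adds network load and may not suit applications that require low latency. Error-resilience techniques therefore aim mainly to protect the video data and to conceal or recover it when errors occur. The second problem is heterogeneity in wide-area networks: different types of networks have different bandwidths and traffic loads. Heterogeneous video networks must provide video services of variable quality and meet these demands automatically and accurately.

Motion estimation is the most critical part of video compression. It is the process of generating motion vectors, which determine the motion parameters used to build the motion-compensated prediction of a frame from the previous frame. Its computational cost is a major challenge for real-time implementation. Motion estimation algorithms can be divided into time-domain and frequency-domain algorithms. Matching algorithms and gradient-based algorithms are the main time-domain families. Matching algorithms divide into block matching and feature matching; gradient-based algorithms divide into pixel-recursive and block-recursive methods. Frequency-domain algorithms use phase correlation, wavelet-domain matching, and DCT-domain matching. Gradient techniques are generally used for image sequence analysis. Pixel-recursive techniques, a subset of gradient techniques, are applied in image sequence coding, where the best-match search is carried out pixel by pixel; they have very high computational complexity and are unsuitable for real-time applications. Frequency-domain techniques rely on the relationship between the transform coefficients of shifted images and are not widely used in image sequence coding. Block matching, which is based on minimizing a specified cost function and searches over n×n pixel blocks, has become the most widely used method in coding applications.

Among motion estimation algorithms, block matching is the dominant approach, and a simple, efficient algorithm is essential for minimizing its search time. Block matching motion estimation (BMME) is the most popular and practical motion estimation method in video coding; both the H.26x and the MPEG families of standards use it. Block matching is a correlation technique that seeks the best match between the current image block and candidate blocks within a specified region of the reference frame. The process uses at least two frames, the reference frame and the current frame. The current frame is divided into macroblocks, and motion estimation is performed on each macroblock separately. For each macroblock to be coded in the current frame, the motion estimation algorithm finds the best-matching macroblock in the reference frame. Once it is found, the difference (the prediction error) between the best match and the current macroblock is computed and then DCT-transformed, quantized, and run-length coded. In addition to this difference, the relative displacement vector between the two macroblocks is also coded.

In this thesis, we first discuss various fast block-based motion estimation algorithms and evaluate them experimentally in terms of search speed and computational complexity. The best-performing algorithms are then analyzed in detail. The algorithms include exhaustive or full search (FS), three-step search (TSS), new three-step search (NTSS), four-step search (4SS), diamond search (DS), and adaptive rood pattern search (ARPS).

Second, the thesis proposes a new dynamic adaptive rood search algorithm derived from ARPS. Because it exploits the spatial correlation between neighboring blocks, we call it ARPS_S to distinguish it from ARPS. ARPS_S rests on the following assumption: the distribution of motion vectors is highly correlated not only with the predicted motion vector but also along the vertical and horizontal directions, which forms a rood (cross) shape. The maximum and minimum motion vector (MV) values of the blocks surrounding the block of interest can be regarded as estimates of the deviation of the predicted MV, so they can serve as accurate estimates of the arm lengths and thus represent the dynamic range of motion in the corresponding directions. Unlike ARPS, the four arms of ARPS_S are not equal in length. ARPS uses 5 initial search points, whereas ARPS_S uses 6. In our experiments, ARPS_S outperforms ARPS in both search speed and video quality.

Finally, the thesis discusses error-resilient coding techniques that use scalable coding strategies. Scalable video coding encodes a video sequence into several bitstreams so that the decoder can support various quality levels. The thesis introduces and evaluates two classes of scalable error-resilient coding: layered coding (LC) and multiple description coding (MDC). The nature of compressed video bitstreams makes error resilience very important. For example, a single bit error in VLC-coded video data can cause loss of synchronization between encoder and decoder, which in turn causes the loss of multiple video blocks. Multiple bit errors, which often occur with burst channel errors or packet loss, can cause the loss of part or all of a video frame and lead to error propagation in the temporal dimension. This propagation is a direct consequence of using motion compensation to reduce temporal redundancy. Error resilience and scalability are two extremely important features of video transmission. Scalability provides good robustness when some information loss is acceptable; it does not cause much difficulty for decoding, nor does it seriously affect visual quality. Robust video coding plays a key role in limiting error propagation and improving visual quality. By combining a sound design with acceptable redundancy at minimal complexity, robust video coding can effectively address the error concealment problem.

Layered coding splits the video sequence into several layers, each of different importance to fidelity. The lowest layer, called the base layer, can be coded independently. The layers above it are called enhancement layers, and their decoding depends on the base layer. The base layer alone gives the lowest video quality, and quality improves as enhancement layers are added. Under congestion, a network that supports layered service first transmits the base-layer packets, which matter most for decoding. Layered video coding was first proposed to combat packet loss in ATM networks and improve transmission robustness. It was later adopted in the MPEG-2 and MPEG-4 standards as a main error-resilience and scalable coding method. Layered coding has also been used in some IP multicast applications, such as the Internet Multicast Backbone. In MDC, by contrast, all bitstreams (descriptions) are equally important. Layered coding is often combined with unequal error protection (UEP), which gives stronger protection to the most important data in transmission, the base layer. Nevertheless, if the base layer is lost (for example, through a server crash or a connection failure) or is received with many errors, the additional information in the enhancement layers is almost useless because of the hierarchical dependence between layers.

MDC compresses the video sequence into several bitstreams of equal importance. Each bitstream, also called a description, can be decoded independently, and the descriptions refine one another: the more descriptions the receiver gets, the higher the reconstructed video quality. Parallel scalability is therefore inherent in multiple description coding. Part of this thesis studies how to generate the bitstreams in LC and MDC. Each frame is first DCT-transformed, then quantized and zigzag-scanned. In layered coding, the most important DCT coefficients (the first ten) are assigned to the base layer and the rest to the enhancement layer. In multiple description coding, the 64 DCT coefficients are split evenly into an odd part and an even part. Simulation results show that the MDC scheme outperforms the LC scheme. Experiments demonstrate that, compared with layered coding, MDC combined appropriately with path diversity or server diversity can markedly improve the robustness of real-time video applications. Because all received information remains useful in MDC when errors occur, it avoids the problems of layered coding over best-effort networks and is therefore very effective for video transmission over best-effort packet networks.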
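
The block-matching procedure described in the abstract (find the best-matching block in the reference frame, then code the prediction error and the displacement vector) can be illustrated with a minimal full-search sketch. This is not the thesis's implementation: the sum of absolute differences (SAD) as the cost function, the 4×4 block size, the ±2 search range, and all function names are assumptions chosen for illustration, since the abstract only speaks of "a specified cost function" over n×n blocks.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref` (assumed cost function)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, p=2):
    """Exhaustive (full) search: test every displacement in a +/-p window and
    return ((dx, dy), cost) for the candidate with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def residual(cur, ref, bx, by, mv, n=4):
    """Prediction error between the current block and its best match; this is
    the signal that would go on to the DCT, quantization, and run-length coding."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


def demo_frames(size=8):
    """Build a synthetic reference frame and a current frame whose content is
    the reference shifted by (dx, dy) = (1, -1); purely illustrative data."""
    ref = [[17 * x + 31 * y + 7 * x * y for x in range(size)] for y in range(size)]
    cur = [[ref[y - 1][x + 1] if 0 <= y - 1 < size and 0 <= x + 1 < size else 0
            for x in range(size)] for y in range(size)]
    return cur, ref


if __name__ == "__main__":
    cur, ref = demo_frames()
    mv, cost = full_search(cur, ref, bx=2, by=2)
    print("motion vector:", mv, "SAD:", cost)
    print("residual:", residual(cur, ref, 2, 2, mv))
```

Full search evaluates (2p+1)² candidates per block, which is the cost that the fast algorithms named in the abstract (TSS, NTSS, 4SS, DS, ARPS) aim to reduce by testing only a subset of those positions.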
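
The two bitstream-generation schemes compared in the abstract can be sketched on a single 8×8 block of quantized DCT coefficients. This is a hedged illustration, not the thesis's code: the function names are hypothetical, "odd and even" is assumed to refer to the zigzag index, and replacing a lost layer or description with zeros is an assumed concealment strategy.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; odd diagonals run top-right to bottom-left,
        # even diagonals run bottom-left to top-right.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square coefficient block into zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def split_layered(coeffs, base_count=10):
    """LC: the first `base_count` coefficients form the base layer,
    the remainder forms the enhancement layer."""
    return coeffs[:base_count], coeffs[base_count:]


def split_mdc(coeffs):
    """MDC: alternate coefficients into two equally important descriptions
    (assumption: even and odd zigzag indices)."""
    return coeffs[0::2], coeffs[1::2]


def merge_layered(base, enh, total=64):
    """Rebuild the coefficient list; without the base layer the enhancement
    layer cannot be used, so the block is lost (all zeros)."""
    if base is None:
        return [0] * total
    rest = enh if enh is not None else [0] * (total - len(base))
    return list(base) + list(rest)


def merge_mdc(desc_a, desc_b, total=64):
    """Rebuild the coefficient list; a missing description (None) is
    replaced by zeros, the other one is still fully used."""
    out = [0] * total
    if desc_a is not None:
        out[0::2] = desc_a
    if desc_b is not None:
        out[1::2] = desc_b
    return out


if __name__ == "__main__":
    block = [[r * 8 + c for c in range(8)] for r in range(8)]  # dummy coefficients
    coeffs = zigzag_scan(block)
    base, enh = split_layered(coeffs)
    desc_a, desc_b = split_mdc(coeffs)
    print("LC, base lost:      ", sum(1 for v in merge_layered(None, enh) if v))
    print("MDC, one desc lost: ", sum(1 for v in merge_mdc(None, desc_b) if v))
```

The sketch makes the asymmetry visible: losing the base layer leaves nothing decodable in LC, while losing either description in MDC still leaves half of the coefficients available for reconstruction.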
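[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: TN919.81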


Article ID: 2502041


Article link: http://sikaile.net/kejilunwen/wltx/2502041.html
