
Analysis of the Nonlinear Characteristics of Speech and Its Applications

Published: 2018-07-01 13:31

  Topic: speech + nonlinear analysis and processing. Source: Nanjing University, doctoral dissertation, 2014


[Abstract]: Speech, the acoustic form of language, has long been an object of study. Aerodynamic research shows that speech production is a nonlinear process. Studies of the nonlinear dynamics of speech signals, together with nonlinear speech processing, have given us a basic understanding of "macroscopic" nonlinear features such as the fractal dimension and the Lyapunov exponents. Speech, however, is a short-time non-stationary signal. Results obtained under the assumptions of stationarity and ample data cannot describe its nonlinear features accurately or in detail, particularly the microstructure in the time domain or in its subspaces. Nonlinear speech analysis and nonlinear signal processing are therefore shifting toward the analysis of fine-structure characteristics. Accordingly, this dissertation studies the nonlinear microstructure of speech in the time domain and in decomposition subspaces. This is needed both to understand speech and, given the advanced state of electronics, signal processing and computer science, to apply speech signal processing more effectively.

The acoustics of speech signals is the foundation of the study. Based on the articulation mechanisms of phonemes, we first discuss three distinct nonlinear modes: the glottal oscillation mode of voiced sounds, the turbulent sound-source mode of unvoiced sounds, and the interaction mode. We then review the known nonlinear characteristics of speech signals. For speech analysis models, we introduce the linear prediction (LP) model, the nonlinear regression model and the nonlinear oscillator model; derive first- and second-order local approximation models from the dynamical equations of the nonlinear oscillator; and examine how these models relate to the LP and nonlinear regression models. This gives an acoustic interpretation to the local linear prediction (LLP) model derived from the nonlinear regression model and to the second-order Volterra model.

A waveform that changes with amplitude is characteristic of a nonlinear signal, and speech phonemes contain onset and offset segments whose amplitude varies with time. Recurrence plot analysis is a graphical method suited to short-time non-stationary signals. We use it to analyze transient segments, such as the onsets and offsets of vowel and nasal signals, which helps improve nonlinear analysis methods based on phase-point distances. To analyze the recurrence properties of speech onsets and offsets in finer detail, we propose a recursive method for computing multi-level-threshold recurrence plots; its computational complexity is lower than that of the original recurrence plot algorithm.

By analyzing the state-evolution process, we propose a Partially Adaptive Multi-step Local Linear Prediction (paLLP) algorithm and analyze its accuracy and computational complexity. Comparison with two existing nonlinear recursive prediction algorithms shows that it achieves satisfactory prediction accuracy, and the complexity analysis shows that it requires far less computation than the LLP algorithm. In experiments, a Lorenz chaotic sequence is used to verify the algorithm's feasibility, accuracy, computational complexity and robustness to interference. Comparative experiments on vowel and nasal signals show that paLLP is an efficient, high-accuracy algorithm for nonlinear prediction of speech. Compared with LP, paLLP is not only more accurate but also leaves far less periodicity in the prediction residual, which should improve codebook performance in paLLP-based code-excited coding and decoding. Inspired by LD-CELP, we design an analysis-by-synthesis (A-B-S) speech codec based on paLLP and describe its operating principle.

Empirical Mode Decomposition (EMD), a method for analyzing nonlinear non-stationary signals, has also been applied to speech signal processing. EMD allows a speech signal to be analyzed in the subspaces of its intrinsic mode functions (IMFs), but many applications simply pick a subset of IMFs by intuition for subsequent processing. To select and apply IMFs on a sound basis, this dissertation analyzes their nonlinear characteristics. Because the sifting process of the original EMD algorithm is unstable, the analysis uses Windowed Average-EMD (WA-EMD) to decompose the speech signal. Given a set of expected frequencies specified in advance, WA-EMD stably decomposes the speech signal into a specified number of IMFs. Estimating the Hurst exponent of each IMF's power spectrum identifies the IMFs that carry most of the important information in the original speech. Higher-order singular spectrum analysis is used to examine the embedding dimension of each IMF; the results show that, apart from a few high-frequency IMFs, the IMFs have lower embedding dimensions than the original speech signal. Finally, the third-order spectra and normalized third-order spectra of all IMFs of each vowel are estimated to assess the nonlinearity of the IMFs. The experiments show that the IMFs carrying the most information from the original speech are essentially linear, which will simplify speech processing tasks such as estimating the instantaneous pitch frequency.

These results deepen our understanding of the nonlinear characteristics of speech signals and improve the performance of nonlinear speech processing.
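
To make the recurrence-plot idea in the abstract concrete, here is a minimal Python sketch of a recurrence plot evaluated at several thresholds. It is not the dissertation's recursive method, whose details the abstract does not give. The function names (`delay_embed`, `multilevel_recurrence`), the time-delay embedding, the Euclidean distance and the parameter values are all assumptions chosen for illustration.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this embedding")
    return np.stack([x[k * tau : k * tau + n] for k in range(dim)], axis=1)

def multilevel_recurrence(x, dim, tau, thresholds):
    """Return an integer matrix whose entry (i, j) counts how many of the
    given thresholds the distance between phase points i and j does not exceed."""
    pts = delay_embed(x, dim, tau)
    # Pairwise Euclidean distances between phase points, computed once.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    levels = np.zeros(dist.shape, dtype=int)
    for eps in thresholds:
        # Each threshold adds one level wherever the two points recur within eps.
        levels += dist <= eps
    return levels

# Illustrative input: a decaying sinusoid standing in for a phoneme offset.
n = np.arange(200)
signal = np.exp(-n / 80.0) * np.sin(0.3 * n)
R = multilevel_recurrence(signal, dim=3, tau=2, thresholds=[0.05, 0.2, 0.5])
```

In this sketch the distance matrix is computed once and reused for every threshold, so adding levels costs only one comparison pass each. Higher values in `R` mark pairs of states that recur more closely.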
[Degree-granting institution]: Nanjing University
[Degree level]: Doctoral
[Year degree conferred]: 2014
[Classification number]: TN912.3



Link to this article: http://sikaile.net/kejilunwen/wltx/2087945.html

