Application of Empirical Mode Decomposition to Single-Channel Blind Speech Separation
Published: 2019-01-14 08:22
[Abstract]: Blind source separation (BSS) is an active research topic in signal processing, and speech separation was one of the first areas to adopt it. With the rapid development of mobile communications and the Internet, speech separation is now used in many applications, including speech recognition systems, computer audition, wireless communications, and video and teleconferencing. Empirical mode decomposition (EMD) is a relatively new signal analysis method that overcomes the limitations of the traditional Fourier notion of frequency and is well suited to nonlinear, non-stationary signals. Because speech is a typical nonlinear, non-stationary signal, EMD offers a new approach to speech processing and a new route to blind speech separation. This thesis studies single-channel blind speech separation based on EMD. The main contributions are as follows.

First, to address the shortcomings of non-negative matrix factorization (NMF) in single-channel speech separation, to avoid the sparsity requirement that NMF places on the mixing matrix, and to reduce frequency-domain aliasing between the separated sources, the mixed speech signal is first processed with ensemble EMD (EEMD). EEMD decomposes the signal into several intrinsic mode functions (IMFs) that carry its instantaneous characteristics and have a simpler spectral structure. Exploiting the short-time stationarity of speech, and given the large data volume of speech signals, a sparse NMF model is applied to the resulting IMFs, with the scale-invariant Itakura-Saito (IS) divergence as the cost function. The source signals are then reconstructed with a clustering algorithm. Simulation experiments show a slight improvement in separation quality.

Second, the theory of single-channel BSS is far less mature than that of conventional underdetermined or overdetermined BSS. To tackle the single-channel underdetermined problem, EEMD is used to convert the single-channel mixture into a virtual multichannel (single-input, multiple-output) signal, which is then processed with the relatively mature fast independent component analysis (FastICA) algorithm before the source signals are reconstructed. Because applying ICA directly to the IMFs produced by EEMD requires too many iterations and converges slowly, principal component analysis (PCA) is first applied to the IMFs to reduce their dimensionality, and FastICA then performs the separation, improving the iteration efficiency of the algorithm. Simulation experiments verify the effectiveness of the algorithm.
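The role of the Itakura-Saito divergence in the NMF step can be illustrated with a small sketch. This is not the thesis's implementation: the function names `is_divergence` and `is_nmf`, the random toy matrix standing in for the power spectrogram of one IMF, the rank, and the iteration count are all assumptions made for illustration, and the sparsity term and the clustering stage are omitted. The update used is a commonly used multiplicative rule with a square-root exponent for the IS cost.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries.

    d_IS(v | v_hat) = v / v_hat - log(v / v_hat) - 1, which depends only on
    the ratio v / v_hat and is therefore unchanged when both are rescaled.
    """
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Approximate a strictly positive matrix V by W @ H under the IS cost.

    Illustrative helper (hypothetical name): plain multiplicative updates
    with exponent 1/2, no sparsity penalty.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive initialisation keeps every ratio below well defined.
    W = rng.random((n_rows, rank)) + 0.1
    H = rng.random((rank, n_cols)) + 0.1
    history = []
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        V_hat = W @ H + eps
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        history.append(is_divergence(V, W @ H + eps))
    return W, H, history


# Toy stand-in for a power spectrogram (assumption: 16 bins x 40 frames).
toy_rng = np.random.default_rng(1)
V = toy_rng.random((16, 40)) + 0.05
W, H, history = is_nmf(V, rank=3)

print("IS divergence after first / last iteration:", history[0], history[-1])
# Scale invariance: rescaling data and model together leaves the cost unchanged.
print(np.isclose(is_divergence(10.0 * V, 10.0 * (W @ H)), is_divergence(V, W @ H)))
```

Because the cost depends only on the ratio between data and model, low-energy time-frequency regions are weighted the same as high-energy ones, which is the usual motivation for choosing the IS divergence for audio spectra.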
[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.3
Article ID: 2408511
Article link: http://sikaile.net/kejilunwen/wltx/2408511.html