

Binaural Speech Separation Based on Auditory Computational Models and Deep Neural Networks

Published: 2018-03-24 03:14

  Topic: binaural-channel speech separation. Approach: regression neural network. Source: master's thesis, University of Science and Technology of China, 2017.


【Abstract】: Speech is one of our most important means of communication. Because of noise in everyday environments and losses in channel transmission, speech quality often suffers, and the information carried by the speech we receive can be greatly diminished. Separating clean speech from noisy speech is therefore a problem of direct practical relevance, and speech separation has become an important research direction in speech signal processing. Over the past few decades, traditional separation methods such as spectral subtraction and Wiener filtering have been studied extensively, but the assumptions they make about the characteristics of speech and interference often fail to hold in real life, which limits their effectiveness in practical settings, for example leaving "musical noise" artifacts in the separated speech. In recent years, auditory scene analysis has received growing attention. Inspired by the human auditory system, it separates speech by extracting effective "scene cues" from the signal, and research on realizing such scene analysis and separation in software is flourishing. However, current auditory scene analysis methods based on classification neural networks, although effective at raising the signal-to-noise ratio of the separated speech, do not preserve its perceptual quality well and introduce audible discontinuities. In this thesis we therefore focus on three things: using deep neural networks for speech separation while remedying these unnatural-sounding defects; building on computational auditory scene analysis theory to extract effective "scene cues" from binaural-channel speech signals, improving separation performance in noisy environments; and exploring computational models of human hearing to extract features that mimic auditory characteristics at the level of auditory cortex receptive fields, further improving separation. First, we propose a binaural-channel speech separation method based on a regression neural network. Instead of classifying and regrouping time-frequency units as a classification network does, we exploit the strong information extraction and modeling capacity of neural networks to estimate the clean target speech directly from the noisy input. By choosing appropriate learning targets and a minimum mean-squared-error training criterion, the estimated speech features retain good continuity and naturalness in both the time and frequency domains. Experiments show that this regression-based method substantially improves the perceptual quality of the separated speech.
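As a concrete illustration of the regression formulation, the sketch below (a minimal example under assumed settings, not the thesis's actual system) trains a feed-forward network with a mean-squared-error criterion to map noisy log-power-spectrum frames to clean ones; the 257-bin feature size, 7-frame context window, layer widths, and optimizer settings are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions, not from the thesis):
# each frame is a 257-bin log-power spectrum, and a context
# window of 7 frames is stacked as the network input.
N_BINS, CONTEXT = 257, 7

class RegressionSeparator(nn.Module):
    """Regression DNN: maps noisy log-power spectra directly to clean ones."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_BINS * CONTEXT, 2048), nn.ReLU(),
            nn.Linear(2048, 2048), nn.ReLU(),
            nn.Linear(2048, N_BINS),  # direct estimate of the clean features
        )

    def forward(self, noisy_frames):
        return self.net(noisy_frames)

model = RegressionSeparator()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
mse = nn.MSELoss()  # minimum mean-squared-error training criterion

# One illustrative training step on random stand-in data.
noisy = torch.randn(32, N_BINS * CONTEXT)  # batch of stacked noisy frames
clean = torch.randn(32, N_BINS)            # corresponding clean targets
optimizer.zero_grad()
loss = mse(model(noisy), clean)
loss.backward()
optimizer.step()
```

The point of the regression view is that the network outputs the clean features themselves rather than a per-unit classification decision; at inference time the estimated log-power spectrum would typically be combined with the noisy phase to resynthesize a waveform.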
Second, building on the regression model and on auditory scene analysis theory, we propose a two-channel feature representation based on log-energy spectra. On top of conventional log-energy spectral features, and exploiting the characteristics of binaural-channel information, we design full-band cross-channel energy-difference features resolved over frequency bins and time, together with low-dimensional global cross-channel energy-difference features. To keep the features informative without letting excessive dimensionality introduce too many model parameters, we also design sub-band cross-channel energy-difference features. Experiments show that the proposed two-channel energy-difference features make effective use of the binaural information and clearly improve separation, with the sub-band variant giving the best system performance.
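To make the cross-channel energy-difference features more tangible, here is a small NumPy sketch (an assumption-laden illustration, not the thesis's exact feature extraction) that computes full-band per-bin, sub-band-averaged, and global log-energy differences between the left-ear and right-ear signals; the STFT parameters and the choice of eight equal-width sub-bands are made up for the example.

```python
import numpy as np

def log_power_spectrum(x, n_fft=512, hop=256):
    """Framewise log-power spectrum of a mono signal (hand-rolled STFT)."""
    n_frames = 1 + (len(x) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([x[i*hop : i*hop + n_fft] * window for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spec) ** 2 + 1e-10)   # shape (frames, bins)

def interaural_features(left, right, n_subbands=8):
    L, R = log_power_spectrum(left), log_power_spectrum(right)
    fullband = L - R                             # per-bin energy difference
    # Sub-band variant: average the difference within equal-width bands,
    # keeping the feature informative but low-dimensional.
    bands = np.array_split(fullband, n_subbands, axis=1)
    subband = np.stack([b.mean(axis=1) for b in bands], axis=1)
    global_diff = fullband.mean(axis=1, keepdims=True)  # one value per frame
    return fullband, subband, global_diff

# Example on random stand-in signals (1 s at 16 kHz):
fs = 16000
left, right = np.random.randn(fs), np.random.randn(fs)
fb, sb, gd = interaural_features(left, right)
print(fb.shape, sb.shape, gd.shape)  # (61, 257) (61, 8) (61, 1)
```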
Finally, drawing on the literature on auditory computational models, we propose a speech separation method based on spectro-temporal receptive-field features of the auditory cortex. Building on existing mathematical models, we design two-dimensional filters that mimic spectro-temporal receptive-field behavior for binaural-channel speech. Because these features are very high-dimensional, we propose and apply several dimensionality reduction schemes, such as frequency-axis averaging and principal component analysis for the single-channel features. When extracting the two-channel "cues", we design spectro-temporal receptive-field energy-difference features and reduce their dimensionality with global weighted sums and partition-wise weighted sums, so that the two-channel features are combined optimally across scales; a per-frequency-band weighted sum is designed as well, so that the combination is optimal across both scales and frequency bands. By letting the model learn the weighting coefficients, we obtain an effective set of reduced-dimension spectro-temporal energy-difference features. Experiments show that this automatically learned feature combination improves the model's separation performance more effectively.
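The spectro-temporal receptive-field idea can be pictured with a small bank of Gabor-like two-dimensional modulation filters convolved with a log spectrogram, with frequency-axis averaging as a fixed stand-in for the learned weighted-sum reduction described above. The filter shape, the modulation rates and scales, and the filter size in this sketch are illustrative assumptions rather than the thesis's actual filters.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_strf(rate, scale, size=15):
    """Gabor-like 2D filter as a crude stand-in for a spectro-temporal
    receptive field: `rate` ~ temporal modulation, `scale` ~ spectral."""
    t = np.linspace(-1, 1, size)
    f = np.linspace(-1, 1, size)
    T, F = np.meshgrid(t, f, indexing="ij")
    envelope = np.exp(-(T**2 + F**2) / 0.5)
    carrier = np.cos(2 * np.pi * (rate * T + scale * F))
    return envelope * carrier

def strf_features(log_spec, rates=(2, 4), scales=(0.5, 1.0)):
    """Convolve the spectrogram with a small STRF bank, then reduce the
    frequency axis by averaging (a fixed stand-in for learned weights)."""
    feats = []
    for r in rates:
        for s in scales:
            response = convolve2d(log_spec, gabor_strf(r, s), mode="same")
            feats.append(response.mean(axis=1))   # frequency-averaged
    return np.stack(feats, axis=1)                # (frames, n_filters)

log_spec = np.random.randn(61, 257)   # stand-in log spectrogram
print(strf_features(log_spec).shape)  # (61, 4)
```

Replacing the fixed `mean` with trainable per-scale and per-band weights, learned jointly with the separation model, would correspond to the weighted-sum reductions the abstract describes.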

【Degree-granting institution】: University of Science and Technology of China
【Degree level】: Master's
【Year of degree】: 2017
【CLC number】: TN912.3

【Similar literature】

Related journal articles (top 10)

1 李從清; 孫立新; 龍東; 任曉光. Current status and prospects of speech separation technology [J]. 聲學技術 (Technical Acoustics), 2008(05).

2 施劍; 杜利民. A real-time blind speech separation system based on a microphone array [J]. 微計算機應用 (Microcomputer Applications), 2008(05).

3 張磊; 劉繼芳; 項學智. Mixed speech separation based on computational auditory scene analysis [J]. 計算機工程 (Computer Engineering), 2010(14).

4 楊海濱; 張軍. A survey of model-based single-channel speech separation [J]. 計算機應用研究 (Application Research of Computers), 2010(11).

5 虞曉; 胡光銳. Speech separation based on Gaussian mixture density function estimation [J]. 上海交通大學學報 (Journal of Shanghai Jiao Tong University), 2000(01).

6 虞曉; 胡光銳. Speech separation based on Gaussian mixture density function estimation [J]. 上海交通大學學報 (Journal of Shanghai Jiao Tong University), 2000(02).

7 張雪峰; 劉建強; 馮大政. A fast frequency-domain blind speech separation system [J]. 信號處理 (Signal Processing), 2005(05).

8 陳鍇; 盧晶; 徐柏齡. An adaptive speech separation method based on speaker state detection [J]. 聲學學報 (Acta Acustica), 2006(03).

9 董優麗; 謝勤嵐. Speech separation with an uncertain number of sources [J]. 現代電子技術 (Modern Electronics Technique), 2008(03).

10 徐方鑫. Application of the Remez exchange algorithm to speech separation [J]. 電腦知識與技術 (Computer Knowledge and Technology), 2012(03).

Related conference papers (top 5)

1 史曉非; 王憲峰; 黃耀P; 劉人杰. Application of a generalized parameter vector algorithm to speech separation [A]. Proceedings of the 2004 Annual Conference of the Communication and Navigation Committee of the China Institute of Navigation [C], 2004.

2 劉學觀; 陳雪勤; 趙鶴鳴. Research on mixed speech separation based on an improved genetic algorithm [A]. Proceedings of the 10th National Conference on Signal Processing (CCSP-2001) [C], 2001.

3 林靜然; 彭啟琮; 邵懷宗. Dual-beam near-field localization and speech separation with a microphone array [A]. Proceedings of the 2nd National Conference on Information Acquisition and Processing [C], 2004.

4 茅泉泉; 趙力. Blind-channel speech separation techniques based on MIMO [A]. Proceedings of the 2004 National Conference on Physical Acoustics [C], 2004.

5 李量; 杜憶; 吳璽宏; Claude Alain. Linear integration of frequency and spatial cues by the human auditory cortex during speech separation [A]. Abstracts of the 14th National Academic Conference of Psychology and the 90th Anniversary Meeting of the Chinese Psychological Society [C], 2011.

Related doctoral dissertations (top 3)

1 王燕南. Speaker-independent single-channel speech separation based on deep learning [D]. University of Science and Technology of China, 2017.

2 趙立恒. Research on monaural speech separation based on computational auditory scene analysis [D]. University of Science and Technology of China, 2012.

3 王雨. Research on single-channel speech separation based on computational auditory scene analysis [D]. East China University of Science and Technology, 2013.

Related master's theses (top 10)

1 趙訓川. Research on speech separation based on computational auditory scene analysis and microphone arrays [D]. Yanshan University, 2015.

2 何求知. Research on key techniques for single-channel speech separation [D]. University of Electronic Science and Technology of China, 2015.

3 曹猛. Reverberant speech separation based on computational auditory scene analysis and deep neural networks [D]. Taiyuan University of Technology, 2016.

4 李梟雄. Research on speech separation based on binaural spatial information [D]. Southeast University, 2015.

5 王瑜. Research on three-channel speech separation based on computational auditory scene analysis [D]. Yanshan University, 2016.

6 王菁. Mixed speech separation based on computational auditory scene analysis [D]. Yanshan University, 2016.

7 束佳明. Research on robust speech separation based on binaural sound source localization [D]. Southeast University, 2016.

8 陳麟琳. Research on underdetermined speech separation methods based on machine learning [D]. Dalian University of Technology, 2016.

9 李號. Single-channel speech separation based on deep learning [D]. Inner Mongolia University, 2017.

10 夏莎莎. Research on training targets in supervised speech separation [D]. Inner Mongolia University, 2017.


Article ID: 1656457


Link to this article: http://sikaile.net/kejilunwen/xinxigongchenglunwen/1656457.html

