Research on Machine-Learning-Based Speech Enhancement Algorithms for Dual-Microphone Mobile Phones

Published: 2018-06-15 19:27

  Topic: neural networks + mobile phones; Source: Nanjing Normal University, 2017 doctoral dissertation


[Abstract]: The mobile phone is the portable communication device with the largest market and the broadest consumer base, and improving its call quality has long attracted wide attention. Because phones are used in so many settings, the background noise they must cope with is highly complex. A noise-reduction algorithm for the phone platform therefore has to handle many kinds of noise flexibly, suppress background noise effectively without sacrificing call quality, and remain robust in real environments: its performance should not degrade with differences in how the user holds the phone or with rotation of the phone during a call. In recent years the applications of artificial intelligence have spread to almost every field. Machine learning, its core, emphasizes improving algorithm performance through continual learning from data, which lets related algorithms such as neural networks adapt to complex and changing external environments. Applying machine learning to phone noise reduction should markedly improve performance in real scenarios, yet little work has been done in this direction. This thesis applies neural network models to phone noise reduction and improves each part of the noise-reduction algorithm, increasing its flexibility and robustness in real use. The main work and contributions are as follows.

(1) Existing dual-channel voice activity detection (VAD) algorithms rely on fixed thresholds and struggle to detect speech and noise accurately across different noise environments. Chapter 2 proposes a new dual-channel VAD algorithm built on a neural network. It uses two new types of feature, the sub-band energy difference and the normalized cross-channel correlation, and classifies speech and noise with a neural network. Because it does not depend on a fixed threshold, it adapts to complex and changing noise environments and is more accurate than existing VAD algorithms based on the inter-channel energy difference and its refinements.

(2) Chapter 3 exploits the fact that the ratio of the noisy-speech power received by the phone's two microphones differs between noise segments and speech segments, and proposes a new VAD algorithm based on the inter-channel power ratio. The neural network VAD of Chapter 2 is then combined with this power-ratio VAD to obtain a speech and noise activity detection algorithm suited to phone noise reduction, which detects speech and noise separately and accurately. The detection results control a time-domain speech enhancement algorithm that denoises the noisy speech signal. While removing noise, it significantly reduces damage to the speech signal and improves intelligibility, and it suppresses directional speech interference particularly well.

(3) To further remove the linearly uncorrelated noise left after the time-domain enhancement of Chapter 3, Chapter 4 transforms the enhanced speech signal and the background noise signal output by the time-domain stage into the frequency domain for further noise reduction, and improves the two key components of the noise-reduction algorithm: noise estimation and noise removal. First, single-microphone and dual-microphone noise estimation algorithms are combined to improve the accuracy of the noise estimate. Then pitch detection is combined with noise reduction: the pitch frequency is estimated in speech frames to identify speech and noise frequency bins, and the Wiener filter parameters are adjusted separately for the two kinds of bin, so that noise is removed while speech bins are preserved as far as possible, reducing speech distortion. Experiments show that, compared with existing dual-microphone noise-reduction algorithms, the improved frequency-domain algorithm reduces damage to the speech signal more effectively and improves call quality.

(4) Differences in how the user holds the phone, and rotation of the phone during a call, affect the performance of noise-reduction algorithms. If the phone's position can be determined in real time and the algorithm's parameters adjusted accordingly, performance can be improved. Most existing localization algorithms require an array of at least three microphones and cannot be used directly on a dual-microphone phone. Chapter 5 proposes a new method, tailored to the phone scenario, that locates the phone in three-dimensional space using only two microphones. Its inputs are the inter-channel time delay and a new feature, the sub-band inter-channel power ratio, derived by analyzing the propagation paths of the target speech to the two microphones. A neural network is trained to output the phone's spatial position.

(5) When the phone is detected to have moved away from the standard call position, the parameters of the time-domain and frequency-domain noise-reduction algorithms of Chapters 3 and 4 are adjusted promptly according to the localization result of the Chapter 5 neural network, avoiding the loss of call performance caused by phone movement. Experiments show that existing dual-microphone noise-reduction algorithms, which ignore phone rotation, cannot guarantee performance in real scenarios, whereas the algorithm proposed in this thesis is more stable and more practical.

The thesis concludes by summarizing the main work and contributions and outlining directions for further research.
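The abstract names the sub-band energy difference and the normalized cross-channel correlation as the VAD input features but gives no formulas. The following is a minimal sketch of how such per-frame features might be computed, not the thesis's actual definitions: the function names, the Hann window, the equal-width band split, the zero-lag correlation, and the default of four bands are all assumptions made for illustration.

```python
import numpy as np

def subband_energy_difference(primary, secondary, n_bands=4, eps=1e-12):
    """Per-band energy ratio (in dB) between two microphone frames.

    Assumption: bands are equal-width slices of the one-sided spectrum.
    """
    window = np.hanning(len(primary))
    spec_p = np.abs(np.fft.rfft(primary * window)) ** 2
    spec_s = np.abs(np.fft.rfft(secondary * window)) ** 2
    bands_p = np.array_split(spec_p, n_bands)
    bands_s = np.array_split(spec_s, n_bands)
    return np.array([
        10.0 * np.log10((bp.sum() + eps) / (bs.sum() + eps))
        for bp, bs in zip(bands_p, bands_s)
    ])

def normalized_cross_correlation(primary, secondary, eps=1e-12):
    """Zero-lag cross-channel correlation, normalized to roughly [-1, 1]."""
    num = np.dot(primary, secondary)
    den = np.sqrt(np.dot(primary, primary) * np.dot(secondary, secondary)) + eps
    return num / den

def frame_features(primary, secondary, n_bands=4):
    """Concatenate both feature types into one vector for a classifier."""
    return np.concatenate([
        subband_energy_difference(primary, secondary, n_bands),
        [normalized_cross_correlation(primary, secondary)],
    ])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(512)          # stand-in for a near-field source
    # Near-field source: the secondary microphone sees an attenuated copy.
    print(frame_features(speech, 0.5 * speech))
    # Diffuse noise: the two channels are independent with similar energy.
    print(frame_features(rng.standard_normal(512), rng.standard_normal(512)))
```

In this toy setup a near-field source gives a large positive energy difference in every band and a correlation near 1, while independent noise gives differences near 0 dB and a correlation near 0. That separation is what a classifier can learn in place of a hand-tuned threshold.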
[Degree-granting institution]: Nanjing Normal University
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: TN912.3; TP181

