Research on Noise-Robust Methods in Speech Recognition
Published: 2018-05-30 22:10
Topic: speech recognition + noise robustness; Source: China University of Mining and Technology, 2014 master's thesis
[Abstract]: Speech recognition systems already perform well under ideal conditions, but the interference signals present in real application environments sharply reduce their recognition accuracy. Denoising has therefore become the key to deploying speech recognition in everyday use, and it is one of the most active problems in the field. The main noise-robust approaches in speech recognition fall into three directions: speech enhancement, noise-robust feature extraction, and model compensation. This thesis combines the first two into a single denoising method to further improve system robustness.
First, for speech enhancement, the thesis analyzes the strengths and weaknesses of the hard threshold function, the soft threshold function, the soft-hard compromise threshold function, and the Garrote threshold function, and constructs an improved threshold function that combines the advantages of all of them. Matlab simulations then verify that the function is feasible and effective.
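The four classical threshold functions named above are commonly used in wavelet threshold denoising, and a small sketch makes their differences concrete. The Python below uses their standard textbook forms; the function names, the `alpha` parameter of the compromise function, and the sample coefficients are illustrative assumptions. The thesis's improved threshold function is not given in the abstract, so it is not reproduced here.

```python
import math

# Textbook forms of four wavelet threshold functions.
# w is a single wavelet coefficient, t >= 0 is the threshold.
# Names and defaults are illustrative assumptions, not taken from the thesis.

def hard_threshold(w, t):
    # Keep coefficients above the threshold unchanged, zero the rest.
    return w if abs(w) > t else 0.0

def soft_threshold(w, t):
    # Shrink surviving coefficients toward zero by t.
    return math.copysign(abs(w) - t, w) if abs(w) > t else 0.0

def compromise_threshold(w, t, alpha=0.5):
    # Soft-hard compromise: alpha = 0 gives hard, alpha = 1 gives soft.
    return math.copysign(abs(w) - alpha * t, w) if abs(w) > t else 0.0

def garrote_threshold(w, t):
    # Garrote: shrinkage t*t/w fades out as |w| grows.
    return w - t * t / w if abs(w) > t else 0.0

if __name__ == "__main__":
    coeffs = [-3.0, -0.5, 0.2, 1.5, 4.0]  # made-up coefficients
    t = 1.0
    for name, fn in [("hard", hard_threshold),
                     ("soft", soft_threshold),
                     ("compromise", compromise_threshold),
                     ("garrote", garrote_threshold)]:
        print(name, [round(fn(w, t), 3) for w in coeffs])
```

The hard function is discontinuous at the threshold, while the soft function is continuous but biases every surviving coefficient by a constant `t`. The compromise and Garrote forms sit between the two, which is the trade-off an improved function would aim to balance.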
Second, noise-robust feature extraction usually relies on MFCC parameters or on MFCC parameters improved with wavelet multiresolution analysis. The FFT used in MFCC extraction keeps a fixed analysis window in both the time and frequency domains, which conflicts with the non-stationary nature of speech signals; MFCC parameters based on wavelet multiresolution analysis, in turn, decompose only the low-frequency part after each transform and leave the high-frequency part untouched. To address these two shortcomings, the thesis proposes an improved feature extraction method based on wavelet packet analysis and verifies that it achieves good recognition results.
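The structural difference between multiresolution analysis and wavelet packet analysis can be sketched in a few lines. The Python below is an illustration only: it assumes a Haar wavelet for simplicity and a frame length divisible by 2**levels, and the function names and the log-energy feature are hypothetical choices, not the thesis's feature extraction method.

```python
import math

def haar_step(x):
    # One orthonormal Haar analysis step: (low-pass, high-pass) halves.
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def dwt_bands(x, levels):
    # Multiresolution analysis: only the low-frequency branch is split again.
    bands, current = [], list(x)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands  # levels + 1 bands of unequal width

def wavelet_packet_bands(x, levels):
    # Wavelet packet analysis: both low and high branches are split again.
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    # 2**levels bands in natural (filter-bank) order,
    # which is not strictly increasing frequency order.
    return nodes

def log_band_energies(bands, eps=1e-12):
    # Hypothetical per-band feature: log of the band energy.
    return [math.log(sum(c * c for c in band) + eps) for band in bands]

if __name__ == "__main__":
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(2.5 * n) for n in range(64)]
    print(len(dwt_bands(frame, 3)), "bands from multiresolution analysis")
    print(len(wavelet_packet_bands(frame, 3)), "bands from wavelet packets")
    print([round(e, 2) for e in log_band_energies(wavelet_packet_bands(frame, 3))])
```

With three levels, the multiresolution tree yields four bands and leaves the upper half of the spectrum as a single band, while the packet tree yields eight equal-width bands. That finer resolution of the high-frequency part is the property the paragraph above relies on.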
Finally, for performance analysis, a speaker-independent, isolated-word, small-vocabulary speech recognition system is built on the combined denoising method. Comparing the recognition rates of different systems under several signal-to-noise ratio (SNR) conditions verifies the effectiveness of the combined method.
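An evaluation across SNR conditions typically needs two small utilities: one that mixes noise into clean speech at a chosen SNR, and one that scores recognition rate. The sketch below shows one common way to write them. The signal, the noise source, the labels, and all function names are made-up assumptions; no data or results from the thesis are reproduced.

```python
import math
import random

def mean_power(x):
    # Average power of a signal given as a list of samples.
    return sum(v * v for v in x) / len(x)

def mix_at_snr(clean, noise, snr_db):
    # Scale the noise so that 10*log10(P_clean / P_noise) equals snr_db.
    scale = math.sqrt(mean_power(clean) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]

def measured_snr_db(clean, noisy):
    # SNR of a noisy signal relative to its known clean version.
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))

def recognition_rate(predicted, reference):
    # Fraction of isolated-word utterances recognized correctly.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
    for target in (0, 5, 10, 20):  # illustrative SNR conditions in dB
        noisy = mix_at_snr(clean, noise, target)
        print(target, round(measured_snr_db(clean, noisy), 3))
```

In a full experiment, each system under comparison would recognize the same noisy test set at each SNR, and `recognition_rate` would be reported per condition.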
[Degree-granting institution]: China University of Mining and Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34
Article ID: 1957021
Article link: http://sikaile.net/kejilunwen/wltx/1957021.html