
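Optimization of the VTS Feature Compensation Algorithm in Speech Recognition Systems

Published: 2018-05-28 17:09

Topic: vector Taylor series + feature compensation; Source: Southeast University, master's thesis, 2016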


[Abstract]: In real-world environments, speech recognition systems perform poorly because of interference from ambient noise. Vector Taylor series (VTS) feature compensation is a model-based feature compensation algorithm that is highly robust and effectively addresses the loss of recognition performance caused by mismatch between the training and test environments. To address the heavy computational load of VTS and its sharp performance drop at low signal-to-noise ratio (SNR), this thesis optimizes a VTS-based isolated-word recognition system. The optimizations are VTS feature compensation built on a two-layer Gaussian mixture model (GMM) structure, and better initial values for noise parameter estimation under multi-environment models. Together they raise the recognition speed and recognition rate of the system and make it more practical. The main work is as follows:

(1) Structural analysis of the robust speech recognition system. The analysis focuses on the key techniques of robust speech recognition: the endpoint detection algorithm based on weighted sub-band spectral entropy, the VTS feature compensation algorithm, and the acoustic models. The acoustic models comprise the GMM used for feature compensation and the hidden Markov model (HMM) used for pattern recognition.

(2) Optimization of the VTS compensation algorithm based on a two-layer GMM. To reduce the computational load of VTS feature compensation, the thesis proposes a two-layer GMM structure for VTS that separates the noise parameter estimation step from the feature mapping step. Training produces two models: GMM1, with a small number of Gaussian mixture components, and GMM2, with a larger number. During feature compensation, GMM1 is first used to estimate the mean and variance of the noise in the test speech; GMM2 is then used, under the minimum mean square error (MMSE) criterion, to map the noisy feature parameters of the test speech to clean speech feature parameters. The optimization greatly reduces computation while maintaining recognition performance.

(3) Optimization of the initial values for noise parameter estimation in multi-environment-model VTS. VTS speech recognition based on multi-environment models selects, from a set of basic environment models, the acoustic model that best matches the current environment and uses it for feature compensation, which effectively reduces the mismatch between the training and test environments. Setting the initial values of the noise parameters according to the optimal GMM effectively keeps the expectation-maximization (EM) algorithm from converging to a local optimum during the iterative solution of the noise parameters, so that EM reaches a more accurate estimate in fewer iterations and recognition performance improves.

(4) Offline simulation tests in MATLAB and real-time tests on a C platform. Extensive experiments on both platforms verify the effectiveness of the proposed optimizations. The experiments show that the two-layer GMM structure increases recognition speed by about 38% on a Chinese speech corpus, and that the optimized initial values for the EM iteration yield more accurate noise parameter estimates, which lowers the system's misrecognition rate, with the effect most pronounced at low SNR.
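The two-layer flow in items (2) and (3) can be illustrated with a minimal sketch. It is not the thesis implementation and rests on several assumptions not found in the abstract: a single log-spectral dimension with additive noise only, so the mismatch function is y = x + log(1 + exp(n - x)); a first-order VTS expansion; a simplified EM-style update that re-estimates only the noise mean and keeps the noise variance fixed; and hypothetical model parameters, candidate environment noise means, and function names.

```python
import math

def mismatch(mu_x, mu_n):
    """Log-spectral mismatch g = log(1 + exp(n - x)), so that y = x + g."""
    return math.log1p(math.exp(mu_n - mu_x))

def noisy_gmm(gmm, mu_n, var_n):
    """First-order VTS: turn each clean Gaussian (w, mean, var) into
    (w, noisy mean, noisy var, G, g), where G = dy/dx at the expansion point."""
    out = []
    for w, mu_x, var_x in gmm:
        g = mismatch(mu_x, mu_n)
        G = 1.0 / (1.0 + math.exp(mu_n - mu_x))
        out.append((w, mu_x + g, G * G * var_x + (1 - G) ** 2 * var_n, G, g))
    return out

def posteriors(y, comps):
    """Return (P(k | y) for each component, log p(y)), using log-sum-exp."""
    logp = [math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (y - m) ** 2 / v)
            for w, m, v, _, _ in comps]
    top = max(logp)
    p = [math.exp(l - top) for l in logp]
    s = sum(p)
    return [q / s for q in p], top + math.log(s)

def log_likelihood(ys, gmm, mu_n, var_n):
    """Total log-likelihood of the noisy frames for one candidate noise mean."""
    comps = noisy_gmm(gmm, mu_n, var_n)
    return sum(posteriors(y, comps)[1] for y in ys)

def estimate_noise_mean(ys, gmm1, mu_n, var_n, iters=10):
    """Layer 1: EM-style re-estimation of the noise mean with the small GMM1.
    Simplification: the noise variance var_n is held fixed."""
    for _ in range(iters):
        comps = noisy_gmm(gmm1, mu_n, var_n)
        num = den = 0.0
        for y in ys:
            post, _ = posteriors(y, comps)
            for gamma, (_, m, v, G, _) in zip(post, comps):
                num += gamma * (1 - G) / v * (y - m)
                den += gamma * (1 - G) ** 2 / v
        mu_n += num / den
    return mu_n

def compensate(y, gmm2, mu_n, var_n):
    """Layer 2: MMSE mapping x_hat = y - sum_k P(k | y) * g_k with the large GMM2."""
    comps = noisy_gmm(gmm2, mu_n, var_n)
    post, _ = posteriors(y, comps)
    return y - sum(p * c[4] for p, c in zip(post, comps))

# Hypothetical toy models: GMM1 is coarse (2 components), GMM2 is finer (6).
GMM1 = [(0.5, 2.0, 1.0), (0.5, 8.0, 1.0)]
GMM2 = [(1 / 6, m, 0.25) for m in (1.0, 2.0, 3.0, 7.0, 8.0, 9.0)]
TRUE_NOISE, VAR_N = 4.0, 0.1

clean = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]
noisy = [x + mismatch(x, TRUE_NOISE) for x in clean]

# Item (3), simplified: pick the best-matching base environment (hypothetical
# noise means) by likelihood and use it as the initial value for the EM update.
env_means = [0.0, 3.0, 6.0]
init = max(env_means, key=lambda m: log_likelihood(noisy, GMM1, m, VAR_N))

mu_n = estimate_noise_mean(noisy, GMM1, init, VAR_N)
restored = [compensate(y, GMM2, mu_n, VAR_N) for y in noisy]

print("initial noise mean:", init, "estimated noise mean:", round(mu_n, 3))
print("restored features:", [round(r, 2) for r in restored])
```

In this sketch the iterative noise estimate touches only the two components of GMM1, while the six components of GMM2 are evaluated once per frame. That is the kind of saving the two-layer structure aims for, although the 38% figure in the abstract comes from the thesis experiments, not from this toy example.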
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TN912.34


Article ID: 1947512



Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/1947512.html

