Research and Implementation of Speech Recognition Algorithms Based on HMM and DNN
[Abstract]: In 2016, artificial intelligence, virtual reality, and wearable devices became frontier research topics in the technology industry. All of them depend on interaction between people and computers. Speech is more efficient than keyboard and mouse input and can also convey complex emotion, so it greatly improves the interaction experience. Speech recognition is therefore expected to be widely adopted as the most convenient form of human-computer interaction. For a long time, acoustic modeling in speech recognition has been based on the GMM-HMM model, which offers reliable accuracy and whose parameters can be trained with the mature EM algorithm. However, because the GMM is a shallow model, its modeling capacity becomes clearly insufficient as the volume of data grows. Deep neural networks (DNNs) have become a focus of speech recognition research because they model and learn complex data better. This thesis studies recognition algorithms based on the HMM and DNN models in depth and analyzes the advantages and disadvantages of the two. The main work is as follows: (1) The speech recognition algorithm based on the hidden Markov model (HMM) is studied. A speech recognition system for robot control commands is built on the CMUSphinx platform, and the language model and acoustic model are obtained by training on speech recordings of ten robot control commands. The experimental results show an average error rate of 7.1%, indicating a high recognition rate for small-vocabulary Chinese speech recognition.
(2) To address the shortcomings of the HMM-based model, the deep belief network (DBN), a type of deep neural network, is studied. A large-vocabulary continuous Chinese speech recognition system is built with the Kaldi toolkit, and a DNN acoustic model is trained on THCHS-30, an open-source Chinese speech corpus. The experimental results show that the DNN model recognizes large-vocabulary speech better than the triphone model. Kaldi is also used to train on the TIMIT speech corpus to obtain a large-vocabulary English speech recognition system, which achieves a high recognition rate. (3) Noise interference has always been a difficulty in speech recognition. While training acoustic models with Kaldi, white noise, car background noise, and cafeteria background noise are added to the training and test speech before DNN training. Comparison across several models shows that the denoising autoencoder (DAE) model learns a more effective low-dimensional representation and can be used to recover noise-corrupted input.
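The noise experiments in item (3) depend on mixing background noise into clean speech. The abstract does not say how the mixing was done, so the following is only a minimal sketch of one common approach: scaling a noise signal so that the mixture reaches a chosen signal-to-noise ratio. The function name `mix_at_snr`, the 10 dB target, and the synthetic sine-wave "speech" are assumptions for illustration, not details from the thesis.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not taken from the thesis or Kaldi.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # SNR_dB = 10 * log10(P_speech / P_noise_scaled); solve for the scale.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


# Synthetic stand-ins: a 440 Hz tone as "speech" and Gaussian white noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = 0.5 * np.sin(2 * np.pi * 440 * t)
white = rng.standard_normal(8000)

noisy = mix_at_snr(speech, white, snr_db=10.0)
```

In practice the same function could be applied with recorded car or cafeteria noise in place of `white`, and at several SNR levels, to build matched noisy training and test sets.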
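[Degree-granting institution]: Jiangxi University of Science and Technology
[Degree level]: Master's
[Year awarded]: 2017
[Classification number]: TN912.34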