Research on Spectrogram-Based Speaker-Dependent Speech Recognition of Two-Character Chinese Words
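Published: 2018-02-01 21:30
Keywords: speech recognition; spectrogram; feature fusion; support vector machine (SVM). Source: Northeast Normal University, 2017 master's thesis. Document type: degree thesis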
[Abstract]: Ever since the computer was invented, people have dreamed of making it understand human language. With the rapid development of electronic products, users are increasingly eager to break free of the keyboard and replace it with speech input, a more convenient and user-friendly input method. Entering Chinese characters in particular has long been an obstacle to the wider adoption of computer applications, so interaction through spoken Chinese is an important research topic. The list of commonly used words in modern Chinese contains 56,008 high-frequency words: 162 words of five or more syllables, 5,855 four-syllable words, 6,459 three-syllable words, 40,351 two-syllable words, and 3,181 one-syllable words. Two-syllable words therefore make up 72% of the list and play a very important role in the common vocabulary. For this reason, this thesis selects 10 two-character Chinese words for its study of speech recognition algorithms, a choice that is strongly representative. Traditional speech analysis uses a fixed-window Fourier transform to obtain time-frequency localized information about the speech signal. Segmenting the signal into short-time speech frames as the basic processing unit destroys the integrity of the information carried by each syllable and, to some extent, degrades recognition performance. This thesis instead applies image processing techniques to speech recognition. It analyzes the spectrograms of two-character Chinese words and extracts features from them using four methods: equal-width banded row projection, column projection, dyadic-width banded row projection, and six-level wavelet packet decomposition of the wide-band and narrow-band spectrograms with the two-dimensional discrete db4 wavelet basis, from which the horizontal, vertical, and diagonal detail energies of each level are computed. The feature sets extracted by these four methods are combined into the feature vector used for recognition, and a support vector machine (SVM) serves as the classifier for the two-character words. The algorithm recognizes speech word by word from the global features of the spectrogram, which brings out the overall time-frequency characteristics of the speech signal. In keeping with the characteristics of Chinese, each voice command is treated as a single image, which preserves the integrity of the utterance and helps improve the recognition rate and robustness of the speech recognition system. Image processing techniques were also used to denoise the speech samples. Although the denoised speech files performed far worse than the noise-free samples, the thesis makes a systematic attempt at the problem and provides an important basis and leads for further research on speech enhancement methods.
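The projection features named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not define the band layouts, so the code assumes one plausible reading in which a spectrogram is a matrix with rows as frequency bins and columns as time frames, "equal-width banding" sums a projection profile over near-equal bands, and "dyadic-width banding" sums it over bands whose widths double (1, 2, 4, ...). All function names and the `n_bands` parameter are hypothetical, and the wavelet-energy features and the SVM classifier are left out.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # assumed layout: spec[frequency_bin][time_frame]


def row_projection(spec: Matrix) -> List[float]:
    """Sum each row (frequency bin) over all columns (time frames)."""
    return [sum(row) for row in spec]


def column_projection(spec: Matrix) -> List[float]:
    """Sum each column (time frame) over all rows (frequency bins)."""
    return [sum(col) for col in zip(*spec)]


def equal_width_bands(profile: Sequence[float], n_bands: int) -> List[float]:
    """Split a projection profile into n_bands near-equal bands and sum each."""
    n = len(profile)
    if not 1 <= n_bands <= n:
        raise ValueError("n_bands must be between 1 and len(profile)")
    edges = [i * n // n_bands for i in range(n_bands + 1)]
    return [sum(profile[a:b]) for a, b in zip(edges, edges[1:])]


def dyadic_bands(profile: Sequence[float]) -> List[float]:
    """Sum a profile over bands whose widths double: 1, 2, 4, ... entries.

    The last band is truncated at the end of the profile.
    """
    feats: List[float] = []
    start, width = 0, 1
    n = len(profile)
    while start < n:
        end = min(start + width, n)
        feats.append(sum(profile[start:end]))
        start, width = end, width * 2
    return feats


def projection_features(spec: Matrix, n_bands: int) -> List[float]:
    """Concatenate the three projection feature sets into one vector."""
    rows = row_projection(spec)
    cols = column_projection(spec)
    return (
        equal_width_bands(rows, n_bands)    # equal-width banded row projection
        + equal_width_bands(cols, n_bands)  # banded column projection
        + dyadic_bands(rows)                # dyadic-width banded row projection
    )


# Toy 4x4 "spectrogram" with made-up values, for illustration only.
toy_spec = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [0, 1, 0, 1],
    [2, 2, 2, 2],
]
features = projection_features(toy_spec, n_bands=2)
```

In a full pipeline along the lines the abstract describes, a vector like `features` would be concatenated with the wavelet detail energies and passed to an SVM classifier. Because each feature group is a set of band sums over the same matrix, every group here adds up to the total energy of the toy spectrogram, which is a handy sanity check.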
[Degree-granting institution]: Northeast Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.34
Article ID: 1482918
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/1482918.html