Development of an Automatic Assessment System for Articulation and Speech Sound Disorders Based on Speech Recognition
Published: 2018-06-23 00:37
Topic: speech recognition + articulation disorder assessment; Source: East China Normal University, 2014 doctoral dissertation
[Abstract]: China has a large number of people with articulation and speech sound disorders, yet the assessment methods in use rely mainly on subjective auditory perception and lack objectivity and stability. In recent years speech recognition technology has been widely applied in many fields, and research on its application to speech and language education has produced some results. In the field of speech disorder assessment and rehabilitation, however, studies based on speech recognition remain rare and have not received enough attention. Drawing on the current state and trends of research on assessment methods for articulation and speech sound disorders in China and abroad, and on results from applying speech recognition to speech and language education, this study carries out an exploratory investigation of the automatic assessment of articulation and speech sound disorders.
The study first proposes the basic idea of automatic articulation disorder assessment based on speech recognition: a computer or similar device automatically evaluates a patient's articulation in three respects, namely content, tone, and disorder type. To test the feasibility of this approach, a feasibility-analysis system was developed on Microsoft's Speech SDK, which ships with its own recognition engine. The system's assessments of the articulation ability of normal-hearing children aged 3-6 were compared with assessments made by subjective auditory perception; the system's average recognition accuracy over all subjects was 83.5%. This gives preliminary evidence that articulation disorder assessment based on speech recognition is feasible, but the result still falls short of the practical clinical needs of speech disorder assessment and rehabilitation.
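As a rough illustration of the comparison described above, the sketch below computes a recognition accuracy for each child and then averages over children. The data layout, the function names, the pinyin labels, and the choice to average per child rather than pool all items are assumptions made for illustration; the source reports only the 83.5% average.

```python
from statistics import mean

def subject_accuracy(targets, recognized):
    """Fraction of test words whose recognized label equals the target word."""
    if len(targets) != len(recognized):
        raise ValueError("targets and recognized must have the same length")
    if not targets:
        raise ValueError("no test words")
    hits = sum(t == r for t, r in zip(targets, recognized))
    return hits / len(targets)

def average_accuracy(sessions):
    """Mean of per-subject accuracies.

    sessions maps a subject id to a (targets, recognized) pair of label lists.
    """
    return mean(subject_accuracy(t, r) for t, r in sessions.values())

# Hypothetical data: two children, four test words each.
sessions = {
    "child_01": (["bao", "tu", "gu", "ke"], ["bao", "tu", "gu", "ke"]),
    "child_02": (["bao", "tu", "gu", "ke"], ["bao", "du", "gu", "ge"]),
}
print(average_accuracy(sessions))
```

Averaging per child keeps a child with many test items from dominating the figure; pooling all items would be an equally plausible reading of "average recognition accuracy".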
The study therefore proposes an assessment method built on an improved technical scheme: a "dual recognition engine" system that combines a self-built recognition engine with Microsoft's built-in engine. First, a speech recognition algorithm based on hidden Markov models (HMMs) is adopted; it seeks the optimal recognition in an overall-average sense, selecting as the result the word-list entry that scores highest under the statistical model. A standard acoustic model was built from 39-dimensional feature parameters extracted from the speech of 62 normal-hearing children aged 3-6 reading a specified word list. Second, earlier research on articulation assessment of hearing-impaired children was used to identify the common articulation errors for Mandarin initials and finals, and the entries produced by these errors were compiled into a table; a module for detecting the disorder type was then implemented on Microsoft Speech SDK. Finally, 4-dimensional fundamental-frequency features were extracted from the corpus with the SMDSF algorithm to build a standard tone recognition model. Validation had two parts. On the same normal-hearing children's corpus used in the feasibility experiment, recognition accuracy exceeded 98%. For hearing-impaired children aged 3-5, articulation ability was assessed by both subjective auditory perception and speech recognition. The two methods showed no significant difference and the values of the four articulation clarity indices were basically consistent, so the system can basically achieve automatic assessment of articulation disorders.
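The recognition rule described above, scoring the utterance against every word-list entry and keeping the best-scoring one, can be sketched with a toy discrete HMM per entry. This is a minimal sketch under stated assumptions, not the thesis's engine: the thesis works with 39-dimensional acoustic features, whereas here observations are already quantized to integer symbols, and all model parameters and entry names are invented for illustration.

```python
import math

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """Forward algorithm in the log domain: returns log P(obs | model).

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability of state s emitting symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def recognize(obs, models):
    """Return the entry whose HMM assigns obs the highest log-likelihood."""
    return max(models, key=lambda w: forward_log_likelihood(obs, *models[w]))

# Two hypothetical 2-state left-to-right models over the symbols {0, 1, 2}.
models = {
    "entry_A": ([1.0, 0.0],
                [[0.6, 0.4], [0.0, 1.0]],
                [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "entry_B": ([1.0, 0.0],
                [[0.6, 0.4], [0.0, 1.0]],
                [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]]),
}
print(recognize([0, 0, 1, 1], models))
```

Working in the log domain avoids numerical underflow on long utterances, which is why the forward recursion sums with `log_sum_exp` rather than multiplying raw probabilities.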
Building on the articulation disorder assessment system, the study proposes an automatic speech sound disorder assessment system by improving the recognition algorithm. On top of the self-built standard acoustic model, it introduces a Speech Repetition Ability Test Word List and a Speech Switching Ability Test Word List, together with assessment methods based on these lists. Detection of speech repetition and speech switching errors is again implemented with the API function in Microsoft Speech SDK that gives priority to recognizing specified entries. Finally, the system was validated on hearing-impaired children by comparing subjective auditory perception assessment with recognition-based assessment. The system's results did not differ significantly from the subjective results, and the assessment indices obtained by the two methods were basically consistent, so the system can basically achieve automatic assessment of speech repetition ability and speech switching ability.
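One way to check for "no significant difference" between subjective and automatic scores for the same children is a paired t-test. The abstract does not say which test the thesis used, so the choice of test, the function name, and the scores below are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(subjective, automatic):
    """Paired t statistic and degrees of freedom for two score lists
    measured on the same children, in the same order."""
    if len(subjective) != len(automatic):
        raise ValueError("score lists must have the same length")
    diffs = [a - s for s, a in zip(subjective, automatic)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired scores")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical articulation clarity scores (percent) for five children.
subjective = [60, 72, 55, 80, 66]
automatic = [62, 70, 57, 79, 66]
t, df = paired_t(subjective, automatic)
print(t, df)
```

The statistic is then compared with the two-tailed critical value for the chosen significance level and degrees of freedom (2.776 for 4 degrees of freedom at the 0.05 level); a smaller absolute value of t means the difference between the two methods is not significant at that level.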
[Degree-granting institution]: East China Normal University
[Degree level]: Doctoral
[Year conferred]: 2014
[Classification number]: TN912.3
Article ID: 2054936
Article link: http://sikaile.net/kejilunwen/wltx/2054936.html