A Study of Perceptual Cues for Nasal Consonants
Published: 2018-10-24 20:27
[Abstract]: The performance of speech recognition systems is affected by many factors, such as speaker differences, speaking style, and environmental noise. One important way to improve recognition rate and stability is to find better, more robust perceptual cues grounded in the characteristics of human auditory perception. To this end, the three-dimensional deep search (3DDS) method was developed to identify the perceptual cues that listeners extract from the speech signal, and it has been applied successfully to the perceptual cues of fricatives and plosives. This paper extends the method to nasal consonants. Based on an analysis of the results of three perceptual experiments, redundant perceptual cues and secondary perceptual cues are defined. The perceptual cue for /m/ is found to be the portion of the speech signal at approximately 363-1250 Hz, and the perceptual cue for /n/ the portion at approximately 939-2826 Hz.
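[Author affiliations]: School of Electronic Engineering, University of Electronic Science and Technology of China; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign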
[Funding]: U.S. National Institutes of Health (Grant No. R21-RDC009277A)
[Classification number]: TN912.34
Article ID: 2292464
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2292464.html