Research on Speech Segmentation Based on Adaptive Neuro-Fuzzy Inference and Hidden Markov Models

Published: 2018-12-27 09:00

[Abstract]: Modern speech technology and research require highly accurate and reliable speech segmentation. Manual segmentation has long been regarded as the most reliable and accurate method. However, it is time-consuming and laborious, and it must be carried out by phonetic experts. In the era of big data, and especially for large speech corpora, this is a critical drawback. Developing high-accuracy automatic speech segmentation is therefore necessary.

The dominant automatic speech segmentation technique is forced alignment. In this method, hidden Markov models (HMMs) are used to build acoustic models of the individual phonemes, and the speech signal is converted into a sequence of feature vectors, one per frame. The models yield approximate boundaries between phonemes, but the results are not accurate enough. On the TIMIT corpus, with a tolerance of 20 ms, conventional HMM-based forced alignment systems achieve an accuracy of between 80% and 89%.

Many methods have been proposed to improve HMM-based automatic speech segmentation. Some researchers have recognized that the gap between HMM-based automatic segmentation and manual segmentation arises because phonetic experts possess knowledge about speech segmentation. Fuzzy logic can translate such knowledge intuitively into fuzzy rules that a computer can use. However, fuzzy rules must be carefully designed by experts, and their completeness cannot be guaranteed. The aim of this study is to propose a more suitable improvement that addresses these problems.

The adaptive neuro-fuzzy inference system (ANFIS) is a machine learning method that combines a neural network with a fuzzy inference system. Compared with other machine learning methods, it has the advantages of both and performs well. It is simple to implement, nonlinear, and uses fuzzy inference rules, which makes it well suited to the problems described above. In this work, ANFIS is used to learn how to correct the positions of segmentation points, compensating both for the differences between manual and machine segmentation and for the systematic segmentation error introduced by the HMM itself.

The experiment consists of two steps. First, context-independent HMMs are used to obtain the initial boundaries. Second, a trained ANFIS corrects the boundaries obtained in the first step. The experiments use the TIMIT database. The results show that ANFIS significantly improves the accuracy of HMM-based automatic speech segmentation: on TIMIT, with a 20 ms tolerance as the evaluation criterion, accuracy rises from 86.25% to 92.08%. This demonstrates the effectiveness of ANFIS for speech segmentation. In addition, the method is easier to build and apply. Future work will aim to improve the accuracy further and to apply the method to other databases.
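The 20 ms tolerance criterion used in the abstract can be illustrated with a minimal Python sketch. It assumes that predicted and reference boundaries are already paired one-to-one and expressed in milliseconds. The function names `boundary_accuracy` and `apply_offsets`, the boundary times, and the correction offsets are all hypothetical and chosen only for illustration. The offsets stand in for the output of a trained correction model such as ANFIS, and none of the numbers are taken from the thesis.

```python
from typing import List, Sequence


def boundary_accuracy(predicted_ms: Sequence[float],
                      reference_ms: Sequence[float],
                      tolerance_ms: float = 20.0) -> float:
    """Fraction of boundaries whose absolute error is within the tolerance."""
    if len(predicted_ms) != len(reference_ms):
        raise ValueError("predicted and reference boundaries must be paired")
    if not reference_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(abs(p - r) <= tolerance_ms
               for p, r in zip(predicted_ms, reference_ms))
    return hits / len(reference_ms)


def apply_offsets(boundaries_ms: Sequence[float],
                  offsets_ms: Sequence[float]) -> List[float]:
    """Second step of the pipeline: shift each initial boundary by a predicted offset."""
    if len(boundaries_ms) != len(offsets_ms):
        raise ValueError("one offset is required per boundary")
    return [b + o for b, o in zip(boundaries_ms, offsets_ms)]


# Illustrative values only (not data from the thesis).
reference = [100.0, 250.0, 400.0, 620.0]    # manual boundaries
hmm_initial = [112.0, 275.0, 395.0, 650.0]  # step 1: forced-alignment output
offsets = [-8.0, -15.0, 0.0, -12.0]         # step 2: stand-in for model output

corrected = apply_offsets(hmm_initial, offsets)
print(boundary_accuracy(hmm_initial, reference))  # 2 of 4 within 20 ms
print(boundary_accuracy(corrected, reference))    # 4 of 4 within 20 ms
```

In practice the offsets would come from a model trained on features around each initial boundary. The sketch only shows how the correction step and the tolerance-based score fit together.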
[Degree-granting institution]: Tianjin University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.3

[Citation]: Dong Liang (董良). Research on Speech Segmentation Based on Adaptive Neuro-Fuzzy Inference and Hidden Markov Models [D]. Tianjin University, 2014.


Article ID: 2392821

Article link: http://sikaile.net/kejilunwen/wltx/2392821.html
