Research on Chinese-Tibetan Bilingual Cross-Language Voice Conversion Methods

Published: 2019-06-22 09:00

[Abstract]: In recent years, with the rapid development of human-computer speech interaction, voice conversion has attracted wide attention from researchers and is expected to be applied in education, communications, and many other fields. In China, research on voice conversion for mainstream varieties such as Mandarin and Cantonese has made considerable progress, but cross-language voice conversion systems for minority languages and dialects are still lacking. The Tibetans are one of China's ancient ethnic minorities, and Tibetan has a large number of speakers spread over a wide area. This thesis takes the Lhasa dialect of Tibetan as its object of study: it builds a corpus of 2,800 Lhasa Tibetan sentences, segments and labels the initials and finals, and builds a Tibetan initial/final library. To perform Chinese-Tibetan cross-language voice conversion, the Tibetan text to be converted is first translated into the corresponding Chinese text; text analysis of the Chinese text yields all of its initials and finals, which are then looked up in the indexed initial/final library. With Tibetan initials and finals as the basic units, and using boundary information, a decision tree is built from a context-dependent question set and the spectral distances between candidate units. For the target Chinese sentence, the decision tree algorithm selects the initials and finals that best fit the context and whose articulation position and sound quality match best. The corresponding Chinese speech is then generated in two ways, by waveform concatenation synthesis and by the STRAIGHT algorithm, which completes the study of the Chinese-Tibetan cross-language voice conversion method. The main work and contributions of the thesis are as follows:
1. A corpus of 2,800 Lhasa Tibetan sentences was built, and a Tibetan initial/final library was extracted from it. The Tibetan text corpus was designed first, the speech corpus was then recorded, and segmentation and labeling produced the information for every initial and final. Finally, the units were classified by Tibetan initial and final and a directory index was built. This completes the Tibetan initial/final library and lays the foundation for Chinese-Tibetan cross-language voice conversion.
2. The STRAIGHT algorithm was adopted for Chinese-Tibetan cross-language voice conversion. It allows flexible modification of speech parameters such as the fundamental frequency, the aperiodicity index, and the smoothed time-frequency spectrum, which improves the sound quality of the converted target speech.
3. Chinese-Tibetan cross-language voice conversion was implemented. For the target Chinese sentence, the decision tree algorithm selects the initials and finals that best fit the context and whose articulation position and sound quality are most suitable. The corresponding Chinese speech is then generated by waveform concatenation synthesis and by the STRAIGHT algorithm, and the converted speech was evaluated with MOS, DMOS, and ABX tests. The experimental results show that speech converted with the STRAIGHT algorithm has better sound quality than speech produced by waveform concatenation synthesis.
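The unit-selection and waveform-concatenation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the `Unit` class, the functions `context_cost`, `select_units`, and `concatenate`, the `"sil"` boundary label, and the toy sample values are all hypothetical. A simple count of mismatched neighbours stands in for the decision tree built from the context question set and spectral distances, and the STRAIGHT branch is not shown.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One recorded initial or final, plus its neighbours in the source sentence."""
    label: str            # unit name, e.g. "m" or "a" (hypothetical labels)
    left: str             # label of the preceding unit, "sil" at a sentence start
    right: str            # label of the following unit, "sil" at a sentence end
    samples: list[float]  # waveform samples of this unit


def context_cost(unit: Unit, left: str, right: str) -> int:
    """Count context mismatches: 0 means both neighbours match the target."""
    return (unit.left != left) + (unit.right != right)


def select_units(targets: list[str], index: dict[str, list[Unit]]) -> list[Unit]:
    """For each target label, pick the stored candidate with the lowest context cost."""
    padded = ["sil"] + targets + ["sil"]
    chosen = []
    for i, label in enumerate(targets):
        left, right = padded[i], padded[i + 2]
        candidates = index[label]  # raises KeyError if the library lacks this unit
        chosen.append(min(candidates, key=lambda u: context_cost(u, left, right)))
    return chosen


def concatenate(units: list[Unit], overlap: int = 2) -> list[float]:
    """Join unit waveforms, linearly cross-fading `overlap` samples at each boundary."""
    out: list[float] = []
    for unit in units:
        seg = unit.samples
        n = min(overlap, len(out), len(seg))
        start = len(out) - n
        for k in range(n):
            w = (k + 1) / (n + 1)  # fade-in weight of the incoming segment
            out[start + k] = (1 - w) * out[start + k] + w * seg[k]
        out.extend(seg[n:])
    return out


# Toy library: two recordings each of "m" and "a", with made-up sample values.
index = {
    "m": [Unit("m", "a", "i", [0.5, 0.5, 0.5]), Unit("m", "sil", "a", [1.0, 1.0, 1.0])],
    "a": [Unit("a", "m", "sil", [0.0, 0.0, 0.0]), Unit("a", "b", "n", [0.2, 0.2, 0.2])],
}
chosen = select_units(["m", "a"], index)
wave = concatenate(chosen, overlap=2)
print([(u.label, u.left, u.right) for u in chosen])
print(wave)
```

In this toy run the candidates whose recorded neighbours match the target context are chosen, and the two three-sample units are joined with a two-sample cross-fade. A real system would replace `context_cost` with the decision-tree lookup and would operate on recorded audio rather than hand-written sample lists.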
[Degree-granting institution]: Northwest Normal University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TN912.3

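[Citation]: Wang Zhenwen (王振文); Research on Chinese-Tibetan Bilingual Cross-Language Voice Conversion Methods [D]; Northwest Normal University; 2015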

Article ID: 2504430

Article link: http://sikaile.net/kejilunwen/wltx/2504430.html
