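Research on Vocal Tract Spectral Parameter Transformation Algorithms in Voice Conversion
Published: 2018-06-27 17:52
Topic: voice conversion + speech signal processing. Source: Nanjing University of Posts and Telecommunications, master's thesis, 2017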
[Abstract]: Voice conversion transforms the individual characteristics of a source speaker's voice so that the converted speech sounds closer to a target speaker, while keeping the linguistic content unchanged. It is a research direction derived from speech signal processing, is closely tied to speech analysis, recognition, and synthesis (each of which advances the others), and has practical applications such as text-to-speech, dubbing for film and television, and medicine. This thesis focuses on the following:
(1) The role of each component of a voice conversion system is discussed. The work concentrates on converting vocal tract spectral feature parameters and, on that basis, reviews several classical conversion models, including vector quantization, Gaussian mixture models, linear multivariate regression, and artificial neural networks.
(2) Radial basis function (RBF) neural networks are often used as the conversion model, and their kernel function parameters are usually trained with K-means clustering. That approach has drawbacks: slow convergence, a tendency to fall into local optima, and weak generalization. The thesis proposes optimizing the RBF network with an improved particle swarm optimization (PSO) algorithm to raise network performance, so that the spectral envelope mapping between source and target speakers is captured more accurately, and studies the effect of this change within the voice conversion system. Experimental results show that the proposed scheme effectively improves the network's performance and brings the converted speech closer to the target speech.
(3) Conventional voice conversion systems convert all vocal tract spectral feature parameters with a single RBF network rule. One rule can hardly fit every feature parameter, which degrades the quality of the converted speech. To address this, the thesis proposes converting the vocal tract spectral feature parameters with a self-organizing feature map (SOM) combined with the improved-PSO-optimized RBF network, using the SOM's good classification ability to build multiple conversion rules. Subjective and objective evaluations indicate that these multi-class mapping rules improve conversion accuracy and the quality of the converted speech.
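The idea in item (2), searching for RBF centers with particle swarm optimization instead of fixing them by K-means, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not specify the thesis's improved PSO, so a plain global-best PSO stands in for it, and the function names, the fixed kernel width, the PSO coefficients, and the random toy data (in place of aligned source/target spectral features) are all hypothetical.

```python
import numpy as np

def rbf_design(X, centers, width):
    # Gaussian activations, shape (n_samples, n_centers)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def rbf_mse(X, Y, centers, width, ridge=1e-6):
    # Output weights by ridge-regularised least squares, then training MSE
    Phi = rbf_design(X, centers, width)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    W = np.linalg.solve(A, Phi.T @ Y)
    return float(np.mean((Phi @ W - Y) ** 2))

def pso_rbf_centers(X, Y, n_centers=6, width=1.0,
                    n_particles=20, n_iters=50, seed=0):
    # Plain global-best PSO (NOT the thesis's improved variant).
    # Each particle encodes one full set of RBF centers.
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    idx = rng.integers(0, len(X), size=(n_particles, n_centers))
    pos = X[idx].reshape(n_particles, n_centers * dim)
    vel = np.zeros_like(pos)

    def cost(p):
        return rbf_mse(X, Y, p.reshape(n_centers, dim), width)

    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    g = int(np.argmin(pbest_cost))
    gbest, gbest_cost = pbest[g].copy(), float(pbest_cost[g])
    initial_cost = gbest_cost

    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Inertia 0.7 and acceleration 1.5 are illustrative choices
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        costs = np.array([cost(p) for p in pos])
        improved = costs < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = costs[improved]
        g = int(np.argmin(pbest_cost))
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), float(pbest_cost[g])

    return gbest.reshape(n_centers, dim), initial_cost, gbest_cost

# Toy stand-in for aligned source (X) and target (Y) feature frames
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = np.tanh(X @ np.array([[1.0, -0.5], [0.3, 0.8]])) + 0.5
centers, initial_cost, final_cost = pso_rbf_centers(X, Y)
```

Because the swarm only ever replaces its best solution with a better one, `final_cost` cannot exceed `initial_cost`. The multi-rule scheme in item (3) would, under the same assumptions, train one such network per SOM cluster and route each frame to the network of its cluster.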
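[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.3; TP18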
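[Similar literature]
Related journal papers (top 1)
1 孫新建; 張雄偉; 楊吉斌; 曹鐵勇; 鐘新毅; A vocal tract spectrum conversion method based on a two-factor Gaussian process dynamic model [J]; Acta Automatica Sinica; 2014, No. 06
Related master's theses (top 10)
1 董添輝; Research on Vocal Tract Spectral Parameter Transformation Algorithms in Voice Conversion [D]; Nanjing University of Posts and Telecommunications; 2017
2 楊秀峰; Research on Neural-Network-Based Voice Conversion Algorithms [D]; Xi'an University of Architecture and Technology; 2017
3 呂中良; Research on Voice Conversion Algorithms for Parallel and Non-Parallel Text Based on Improved BLFW [D]; Nanjing University of Posts and Telecommunications; 2017
4 靳棟棟; Research on a Mine Transportation Control and Voice Integration System [D]; China University of Mining and Technology; 2017
5 王志龍; Research and Practice of VoLTE Optimization in Gansu Province [D]; Lanzhou Jiaotong University; 2017
6 賀偉; Analysis and Optimization of VoLTE Interoperation [D]; University of Electronic Science and Technology of China; 2017
7 王建偉; Research and Design of a Deep-Learning-Based Emotion Perception System [D]; University of Electronic Science and Technology of China; 2017
8 劉沖沖; Research on a Sagnac/Φ-OTDR Hybrid Fiber-Optic Voice Sensor and Its Speech Denoising Method [D]; Anhui Normal University; 2017
9 鮑承毅; Design and Implementation of a Mobile Learning System Based on Voice Media [D]; Central China Normal University; 2017
10 水晶; Research on Server Push Technology for a Voice Dispatching Web Platform [D]; Chang'an University; 2017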
Article number: 2074737
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2074737.html