Research on Speech Emotion Recognition Technology Based on Kernel Functions

Published: 2018-10-05 19:48

[Abstract]: As an important branch of affective computing, emotion recognition has attracted wide attention from researchers in China and abroad in recent years. Speech, one of the main channels of human communication, carries a large amount of information about the speaker's emotions. Speech emotion recognition enables a computer to identify the speaker's emotional state from the speech signal, supports more harmonious human-computer interaction, and has broad application prospects in everyday life. This thesis studies speech emotion recognition based on kernel functions: it introduces kernel methods into traditional pattern recognition algorithms to strengthen their nonlinear processing ability, and proposes several improvements to those algorithms for speech emotion recognition. The main research content and contributions are as follows.

(1) The research background and significance of speech emotion recognition are described, and the state of research in China and abroad is summarized for emotion description models, emotion databases, emotion feature parameters, feature dimensionality reduction, and emotion classification algorithms.

(2) A Chinese speech emotion database was designed and recorded. It contains speech in five basic emotions: happiness, anger, sadness, fear, and calm. All speech samples passed a validity check to ensure that the data meet the specification. The speech signals in the database were preprocessed, and speaking rate, energy and amplitude, fundamental frequency, formants, MFCCs, and other parameters were extracted to form emotion feature vectors. The way these parameters vary across emotional states was analyzed, laying the groundwork for the subsequent speech emotion experiments.

(3) An algorithm that combines kernel C-means clustering with kernel K-nearest-neighbor classification is proposed for speech emotion recognition. The algorithm uses a kernel mapping to map the original input space into a high-dimensional feature space, performs C-means clustering in that feature space to construct representative emotion templates, and then classifies test samples with the K-nearest-neighbor rule. It uses the kernel method to improve the nonlinear processing ability of the classifier, and it also avoids a drawback of traditional kernel K-nearest-neighbor classification, which must compute the distance between the test sample and every training sample, so classification is faster. To further improve the recognition accuracy, fuzzy set theory is introduced into the algorithm: fuzzy clustering yields better emotion clusters, and a membership function constructed for the nearest-neighbor step lets a test sample belong to each emotion category to a different degree, giving classification results that better match reality. Experiments show that the algorithm recognizes emotions more effectively.

(4) The kernel sparse representation classification algorithm is applied to speech emotion recognition. The kernel mapping mechanism extends the traditional sparse representation classifier to a kernel sparse representation classifier, which overcomes the inability of the sparse representation classifier to handle nonlinear problems effectively and lets a test sample be represented more accurately as a sparse linear combination of the training samples. The algorithm is then improved using the idea of local coding, giving a locality-constrained weighted kernel sparse representation classification algorithm. Compared with kernel sparse representation classification, it allows a test sample to be sparsely represented by more of its neighboring training samples, which can improve classification accuracy to some extent.

(5) The kernel functions used in the support vector machine are studied in depth and improved. To highlight the different contributions of individual features to classification, feature-importance information is incorporated into the polynomial kernel and the Gaussian kernel. The improved polynomial and Gaussian kernels are then combined into a composite kernel, and an optimization algorithm searches for the optimal kernel parameters to obtain the best-performing composite kernel. The method improves the base kernels, replaces a single kernel with a composite kernel, and finds the optimal kernel parameters and combination parameters through optimization, so it improves the traditional support vector machine in several ways to raise performance.
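The abstract does not state the exact form of the feature-weighted kernels or of the composite kernel described in item (5). The following is only a minimal sketch under assumed definitions: a per-feature weight vector `w` scales each feature inside a Gaussian kernel and a polynomial kernel, and the two are mixed by a convex coefficient `lam`. The function names, the weights, and the toy data are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def weighted_rbf_kernel(X, Y, w, gamma=1.0):
    """exp(-gamma * sum_i w_i * (x_i - y_i)^2) for every pair of rows (assumed form)."""
    Xs = X * np.sqrt(w)  # scaling by sqrt(w) puts w inside the squared distance
    Ys = Y * np.sqrt(w)
    sq_dist = (
        np.sum(Xs ** 2, axis=1)[:, None]
        + np.sum(Ys ** 2, axis=1)[None, :]
        - 2.0 * Xs @ Ys.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def weighted_poly_kernel(X, Y, w, degree=2, coef0=1.0):
    """(sum_i w_i * x_i * y_i + coef0) ** degree for every pair of rows (assumed form)."""
    return ((X * w) @ Y.T + coef0) ** degree


def combined_kernel(X, Y, w, lam=0.5, gamma=1.0, degree=2, coef0=1.0):
    """Convex combination of the two weighted base kernels, with 0 <= lam <= 1."""
    return (lam * weighted_poly_kernel(X, Y, w, degree, coef0)
            + (1.0 - lam) * weighted_rbf_kernel(X, Y, w, gamma))


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))            # 6 toy samples, 4 features (made-up data)
w = np.array([0.4, 0.3, 0.2, 0.1])     # hypothetical feature-importance weights
K = combined_kernel(X, X, w, lam=0.3, gamma=0.5)
```

With non-negative weights, `coef0 >= 0`, and `lam` in [0, 1], both base kernels and their combination remain positive semidefinite, so the Gram matrix `K` could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. The parameters `lam`, `gamma`, `degree`, and `w` would then be the quantities a search procedure tunes; which optimization algorithm the thesis uses is not stated in the abstract.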
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TN912.34

Chen Wenxi (陳文汐); Research on Speech Emotion Recognition Technology Based on Kernel Functions [D]; Southeast University; 2015

Article link: http://sikaile.net/kejilunwen/wltx/2254646.html
