Effective Feature Analysis Based on Speech Data and Its Application to Depression Level Assessment
Published: 2018-11-19 11:19
[Abstract]: Depression is a common mental disorder characterized by high incidence, relapse, and suicide rates but low awareness and treatment rates, and it seriously endangers physical and mental health. In recent years, as social pressure has grown, the incidence of depression has risen year by year, and about 300 million people worldwide now suffer from it. Current diagnosis depends heavily on the physician's clinical experience and the patient's self-report, so it is strongly affected by subjective factors. An objective, effective, and convenient assessment method is therefore needed to assist diagnosis, and speech, being non-invasive and low-cost, has become a strong objective indicator for detecting depression. Speech-based depression studies can be divided by method into cross-sectional studies and longitudinal studies. Longitudinal studies collect speech from depressed patients at regular intervals during treatment and track how speech features change with depression level; because they follow changes within individuals, their conclusions do not necessarily carry over to classifying a population. Cross-sectional studies collect speech within a short time window and examine how speech features differ between healthy and depressed people, but they pay little attention to classifying people at different depression levels, and individual differences have led to inconsistent conclusions. Unlike the simpler task of separating healthy from depressed people, people at different depression levels differ only subtly in psychological and physiological terms, which makes differences in their speech features hard to detect. No study has yet clearly identified features that effectively distinguish people at different depression levels.
To address this problem while accounting for individual differences, this thesis carries out the following work and makes the following contributions. (1) A speech dataset is constructed, and new features not discussed in related research are introduced. A total of 132 subjects (72 women and 60 men) were selected and divided by scale score into three groups: normal, mild depression, and severe depression. The groups were matched on age, education, occupation, and other factors to reduce confounding, and speech was elicited using the paradigms and emotional stimuli commonly used in this field, yielding a three-class speech dataset. The dataset contains 14 categories of features, including classic features listed in related studies and new features that have not been discussed before. (2) Effective features are screened using several methods from statistical analysis and dimensionality reduction. This yielded several feature sets that effectively distinguish people at different depression levels. All of them combine prosodic and spectral features, for example prosodic features such as sound intensity and spectral features such as Mel-frequency and LPC coefficients. Five feature sets were selected for the male data and four for the female data, and they achieved good results on the three-class problem. (3) A multi-feature-set decision classification system is built from these feature sets and applied to the speech data, improving speech-based assessment of depression level. The system uses GMMs: a model is trained on each feature set, and the predictions are then combined by decision fusion. Classification accuracy reached 70% on the male data and 75% on the female data, an improvement over related studies.
In summary, this thesis constructs a three-class speech dataset based on depression level, uses several statistical analysis and dimensionality reduction methods on it to find multiple effective feature sets that classify the speech data well, and builds from them a multi-feature-set decision classification system that improves the accuracy of depression level assessment compared with related studies. These results provide a basis for using speech data to assess depression level.
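The multi-feature-set decision system in contribution (3) can be illustrated with a minimal sketch. The abstract does not specify the number of mixture components or the fusion rule, so everything below is an assumption made for illustration: each class model is a single diagonal Gaussian (a one-component GMM), fusion is a plain majority vote across feature sets, the data are synthetic, and the function names (`fit_gaussian`, `train`, `predict`, `demo`) are hypothetical rather than taken from the thesis.

```python
import math
import random
from collections import Counter


def fit_gaussian(rows):
    """Estimate per-dimension mean and variance (diagonal covariance)."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    variances = [
        # Floor the variance so the log-density below never divides by zero.
        max(sum((r[d] - means[d]) ** 2 for r in rows) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian (one-component GMM)."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def train(samples, labels, feature_sets):
    """Fit one class-conditional model per (feature set, class) pair."""
    models = []
    for cols in feature_sets:
        per_class = {}
        for c in sorted(set(labels)):
            rows = [[s[j] for j in cols] for s, y in zip(samples, labels) if y == c]
            per_class[c] = fit_gaussian(rows)
        models.append(per_class)
    return models


def predict(x, models, feature_sets):
    """Each feature set votes for its most likely class; the majority wins.

    Ties go to the class that received its first vote earliest.
    """
    votes = []
    for cols, per_class in zip(feature_sets, models):
        sub = [x[j] for j in cols]
        votes.append(max(per_class, key=lambda c: log_likelihood(sub, per_class[c])))
    return Counter(votes).most_common(1)[0][0]


def demo(seed=0):
    """Train and score on synthetic three-class data (not real speech features)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for k, name in enumerate(["normal", "mild", "severe"]):
        for _ in range(40):
            samples.append([rng.gauss(4.0 * k, 1.0) for _ in range(6)])
            labels.append(name)
    feature_sets = [[0, 1], [2, 3], [4, 5]]  # hypothetical column groupings
    models = train(samples, labels, feature_sets)
    hits = sum(predict(s, models, feature_sets) == y for s, y in zip(samples, labels))
    return hits / len(samples)
```

Calling `demo()` returns the training-set accuracy on the synthetic data. A closer approximation of the described system would replace `fit_gaussian` with a multi-component mixture fitted by EM and would train separate systems for the male and female data, as the abstract reports separate results for each.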
[Degree-granting institution]: Lanzhou University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TN912.3
Article ID: 2342147
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2342147.html