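Research on Speech Segmentation Algorithms Based on a Wearable Social Sensing System
Topic: HMM algorithm + HMM-KLD algorithm; Source: University of Electronic Science and Technology of China, master's thesis, 2017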
[Abstract]: With advances in technology and rising living standards, physical and mental health has become a concern for society. Researchers can often analyze and assess a person's physical and mental health objectively through social sensing features. Speech signal processing is an important research direction in this field: by extracting, analyzing, and fusing speech features, it supports an objective and comprehensive assessment of the mental health of people in social settings. An efficient speech segmentation algorithm therefore benefits the extraction of social speech sensing features. To address the limited segmentation accuracy of the traditional unsupervised approach based on the HMM (Hidden Markov Model), this thesis proposes an HMM speech segmentation algorithm fused with KL divergence (Kullback-Leibler divergence), referred to as HMM-KLD, to further improve segmentation accuracy. The HMM-KLD segmentation results, however, raise the problem of deciding which segmentation algorithm is optimal. To address this difficulty, the thesis proposes a method based on sparsity-related features that makes this decision automatically. The thesis covers the following aspects.
(1) Based on a review of domestic and international research on wearable social sensing systems and speech segmentation systems, the thesis argues that an efficient speech segmentation algorithm helps a social sensing system analyze speech features and supports the study of how social sensing behavior or psychological state relates to speech features in practical applications.
(2) Using the traditional unsupervised HMM speech segmentation method, the thesis evaluates segmentation accuracy in different noise scenarios. The process has three stages: the first stage denoises the signal based on spectral entropy and short-time autocorrelation features; the second stage removes the silent portions of the signal based on short-time energy; the third stage separates out the wearer's speech signal, based mainly on the differences in short-time energy between the people wearing the different social sensing devices.
(3) Because of the limited segmentation accuracy of the HMM algorithm, the thesis proposes an HMM speech segmentation algorithm fused with KL divergence (Kullback-Leibler divergence) to further improve segmentation accuracy, and verifies the improvement on the collected speech signals.
(4) To address the optimal decision problem in the HMM-KLD speech segmentation algorithm, the thesis proposes a strategy, based on speech sparsity-related features, that automatically selects between speech segmentation algorithms and further improves segmentation accuracy.
(5) Based on basic speech features and prosodic speech features, the thesis explores how the degree of intimacy between speakers, and the social characteristics of elderly people, relate to their speech sensing features.
(6) The thesis concludes with a summary and outlook: it reviews the strengths and weaknesses of the speech segmentation algorithms presented, outlines possible later improvements, and looks ahead to the application of speech segmentation and speech feature analysis in wearable social sensing systems.
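The abstract names several frame-level quantities (short-time energy, spectral entropy, and KL divergence) without giving formulas. The following Python sketch illustrates the standard textbook definitions of these quantities; it is not the thesis's implementation. The frame length, hop size, window length, the synthetic test signal, and all function names are assumptions chosen for illustration. The short-time autocorrelation feature and the HMM itself are omitted, and the abstract does not say how KL divergence is fused with the HMM, so the windowed symmetric-KL score below is only one common way to flag a change point.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def spectral_entropy(frames, eps=1e-12):
    """Shannon entropy of each frame's normalized power spectrum."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    p = power / (power.sum(axis=1, keepdims=True) + eps)
    return -np.sum(p * np.log(p + eps), axis=1)

def gaussian_kl(mu0, var0, mu1, var1):
    """KL(N(mu0, var0) || N(mu1, var1)) for univariate Gaussians."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def symmetric_kl_score(feature, win):
    """Symmetric KL between Gaussians fit to the `win` frames before and
    after each frame; large values suggest a change point (assumed scheme)."""
    scores = np.zeros(len(feature))
    for t in range(win, len(feature) - win):
        left, right = feature[t - win:t], feature[t:t + win]
        m0, v0 = left.mean(), left.var() + 1e-8   # small floor avoids division by zero
        m1, v1 = right.mean(), right.var() + 1e-8
        scores[t] = gaussian_kl(m0, v0, m1, v1) + gaussian_kl(m1, v1, m0, v0)
    return scores

def make_demo_signal(sr=16000, seed=0):
    """Hypothetical test signal: 1 s of faint noise, then 1 s of noise plus a 440 Hz tone."""
    rng = np.random.default_rng(seed)
    x = 0.01 * rng.standard_normal(2 * sr)
    t = np.arange(sr) / sr
    x[sr:] += 0.5 * np.sin(2 * np.pi * 440.0 * t)
    return x

if __name__ == "__main__":
    x = make_demo_signal()
    frames = frame_signal(x, frame_len=400, hop=160)   # 25 ms frames, 10 ms hop at 16 kHz
    log_energy = np.log(short_time_energy(frames) + 1e-12)
    scores = symmetric_kl_score(log_energy, win=20)
    print("frame with highest change score:", int(np.argmax(scores)))
    print("mean spectral entropy, first vs. second half:",
          spectral_entropy(frames[:90]).mean(), spectral_entropy(frames[110:]).mean())
```

On the synthetic signal, the quiet half has low energy and high spectral entropy while the tonal half has high energy and low spectral entropy, and the change score peaks around the frames that straddle the one-second mark. Real conversational recordings from wearable devices would need thresholds and window sizes tuned to the data.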
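[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.3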
Article ID: 2033807
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2033807.html