A Complex Sound Recognition Method for Noisy Environments

Published: 2018-10-26 19:27

[Abstract]: Society has entered the era of artificial intelligence, and speech recognition technology is now fairly mature. For complex sounds in everyday life, however, recognition research remains far from mature and still has many problems and shortcomings, owing to the complexity and diversity of the sound sources and the interference of background noise. Research on recognizing complex sounds in noisy environments is therefore of considerable practical and theoretical value. A complex sound is a sound signal that contains multiple sound types whose boundaries are hard to distinguish. Current detection methods for such sounds mostly carry over traditional speech recognition techniques. Speech signals are produced in a relatively fixed way and have stable energy, whereas complex sounds come in many varieties, are produced by different mechanisms, have large instantaneous energy, and are also disturbed by environmental noise, so traditional speech recognition techniques alone do not transfer well to complex sound recognition. To address the low recognition accuracy for this class of sounds in noisy environments, this thesis carries out the following work: (1) It first introduces several time-domain and frequency-domain features commonly used in sound recognition. By extracting and analyzing the feature parameters of complex sound samples, it proposes describing complex sounds with a combination of time-domain and frequency-domain features, and runs comparison experiments on several hybrid features. (2) In studying recognition methods for complex sounds in noisy environments, and in response to the difficulty of selecting training samples manually, it proposes a training sample selection algorithm based on clustering annotation that selects a representative set of training samples more quickly and accurately, and runs comparison experiments on different clustering methods. (3) Finally, it proposes a complex sound recognition framework based on the hidden Markov model (Hidden Markov Model, HMM) and carries out training and recognition. Simulation experiments on two different types of complex sound, train sounds and bird calls, show that three choices together significantly improve both the accuracy and the efficiency of complex sound recognition in noisy environments: representing complex sounds with a hybrid feature that combines the time-domain short-time autocorrelation function with the frequency-domain Mel-frequency cepstral coefficients (MFCC), selecting training samples with the proposed algorithm based on affinity propagation clustering annotation, and modeling with the HMM recognition framework.
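The hybrid feature named in the abstract, a time-domain short-time autocorrelation function combined with frequency-domain cepstral coefficients, can be illustrated with a minimal sketch. The thesis text is not available here, so everything below is an assumption made for illustration: the function names, the frame length, the hop size, the Hamming window, the normalization by R(0), and the number of lags. The cepstral coefficients are treated as an externally computed input (for example, MFCCs from an audio library) and are not computed in the sketch.

```python
import math

def frame_signal(x, frame_len, hop):
    """Split a sequence into overlapping frames, dropping the ragged tail."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]

def short_time_autocorr(frame, n_lags):
    """Return R(k)/R(0) for k = 1..n_lags on one Hamming-windowed frame.

    R(k) = sum over m of w[m] * w[m + k], where w is the windowed frame.
    The window and the normalization are illustrative choices.
    """
    n = len(frame)
    w = [v * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, v in enumerate(frame)]
    r0 = sum(v * v for v in w) or 1e-12  # guard against an all-zero frame
    return [sum(w[m] * w[m + k] for m in range(n - k)) / r0
            for k in range(1, n_lags + 1)]

def hybrid_feature(frame, spectral_coeffs, n_lags=40):
    """Concatenate autocorrelation lags with externally computed
    frequency-domain coefficients (e.g. MFCCs) into one feature vector."""
    return short_time_autocorr(frame, n_lags) + list(spectral_coeffs)

# Demo on a synthetic tone with a period of 20 samples.
signal = [math.sin(2 * math.pi * t / 20) for t in range(2000)]
frames = frame_signal(signal, frame_len=400, hop=200)
acf = short_time_autocorr(frames[0], n_lags=40)
best_lag = 1 + max(range(len(acf)), key=acf.__getitem__)
print(best_lag)  # the strongest lag matches the tone period: 20
```

For a periodic input, the normalized autocorrelation peaks at the period of the signal, which is why this time-domain feature complements a spectral-envelope description such as MFCC. Per-frame vectors built this way would then serve as the observation sequence for an HMM.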
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.34

Fan Peng (樊鵬); A Complex Sound Recognition Method for Noisy Environments [D]; Hefei University of Technology; 2017

Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2296724.html
