Research on Sound Recognition Algorithms Based on Neural Networks
Topic: sound recognition + neural networks; Source: Beijing University of Posts and Telecommunications, master's thesis, 2014
[Abstract]: With the arrival of the big-data era, industrial production and everyday life are both saturated with multimedia data, and sound, as an important component of that data, carries a great deal of information. Processing and analyzing sound data makes it possible to extract useful information from large volumes of data, so sound processing and analysis has long been an active research topic worldwide. Sound recognition in particular has received considerable attention and seen wide application in recent years. Sound recognition compares the features of the sound to be recognized with the features of sound samples in order to judge whether the test sound is consistent with a sample. It can be applied in many fields, such as abnormal environmental sound monitoring, audio retrieval, and copyright monitoring of audio media.

Before a sound can be recognized it must be pre-processed; the pre-processing pipeline includes pre-emphasis, framing and windowing, and endpoint detection. Feature extraction is then applied to the pre-processed signal to obtain the sound's feature vector. The pattern-matching stage follows and produces the final recognition result.

Building on the basic working principles of neural networks, this thesis studies how to apply neural networks to the pattern-matching problem in multi-class sound recognition. The main work of the thesis is as follows:

1. After introducing the fundamentals of neural networks, the thesis investigates the specific parameters of the two-class recognition network and determines several of them, including the transfer function and the numbers of neurons and layers.
2. It discusses schemes for multi-class sound recognition, compares three of them (linear recognition, parallel ranking, and two-class promotion), and adopts a multi-class recognition scheme built on two-class promotion recognition.
3. It describes in detail how two-class recognition networks are used to recognize multiple classes of sound, and gives a full account of the multi-group competition method and the confidence rate. Multi-group competition can substantially raise the recognition rate of the two-class networks, and a concrete example verifies its usefulness in multi-class recognition. Building the confidence-rate calculation into the recognition program lets the user actively control the recognition process and reach a satisfactory result that balances recognition time against recognition rate.
4. For the case where the total number of sound classes is arbitrary, it presents a grouped-matching competition method. Several concrete examples are used to discuss the basic regularities of grouped-matching recognition, and recommended grouping models are summarized for problems with up to ten classes; problems with more classes can be handled by first dividing the classes into groups of no more than ten.
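The pre-processing steps named in the abstract (pre-emphasis, framing and windowing, endpoint detection) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's implementation: the pre-emphasis coefficient 0.97, the Hamming window, the frame length and hop, and the simple energy-threshold endpoint rule are all common defaults assumed here, since the abstract specifies none of them.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed default)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_and_window(signal, frame_len=256, hop=128):
    """Split the signal into overlapping frames and apply a Hamming window to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames

def detect_endpoints(frames, ratio=0.1):
    """Return (first, last) indices of frames whose short-time energy exceeds
    ratio * max energy, or None if every frame is silent.
    The relative threshold is an assumption chosen for illustration."""
    energies = [sum(s * s for s in frame) for frame in frames]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]

# Synthetic test signal: silence, a 440 Hz tone sampled at 8 kHz, then silence.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(2048)]
signal = [0.0] * 1024 + tone + [0.0] * 1024

frames = frame_and_window(pre_emphasis(signal))
endpoints = detect_endpoints(frames)
print(len(frames), endpoints)
```

Only the frames between the detected endpoints would be passed on to feature extraction; a real system would typically combine energy with other cues such as the zero-crossing rate.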
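One plausible reading of the two-class promotion scheme with multi-group competition is a single-elimination tournament in which each pairing is settled by a majority vote among several two-class deciders. The sketch below is an assumption-laden illustration, not the thesis's method: the names `make_decider`, `pairwise_vote`, and `promote` are hypothetical, the nearest-prototype deciders merely stand in for trained two-class neural networks, and the vote share reported here is only a rough proxy for the confidence rate, which the abstract does not define.

```python
from collections import Counter

def make_decider(prototypes):
    """Hypothetical stand-in for one trained two-class network:
    picks whichever of the two labels has the nearer prototype."""
    def decide(x, a, b):
        def dist2(label):
            return sum((xi - pi) ** 2 for xi, pi in zip(x, prototypes[label]))
        return a if dist2(a) <= dist2(b) else b
    return decide

def pairwise_vote(x, a, b, group):
    """Majority vote of several deciders on the pair (a, b).
    Returns (winner, share of votes); use an odd group size to avoid ties."""
    votes = Counter(decide(x, a, b) for decide in group)
    winner, count = votes.most_common(1)[0]
    return winner, count / len(group)

def promote(x, labels, group):
    """Winners of each pairing advance round by round until one label remains.
    Also returns the smallest vote share seen along the way."""
    remaining = list(labels)
    min_share = 1.0
    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining) - 1, 2):
            winner, share = pairwise_vote(x, remaining[i], remaining[i + 1], group)
            next_round.append(winner)
            min_share = min(min_share, share)
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])  # odd label out gets a bye
        remaining = next_round
    return remaining[0], min_share

# Illustrative 2-D "features"; real inputs would be extracted feature vectors.
base = {"dog": (0, 0), "siren": (4, 0), "glass": (0, 4), "rain": (4, 4), "horn": (2, 2)}

def shifted(dx, dy):
    return {k: (px + dx, py + dy) for k, (px, py) in base.items()}

# Three deciders with slightly different prototypes, as if trained on different sample groups.
group = [make_decider(base), make_decider(shifted(0.3, -0.2)), make_decider(shifted(-0.2, 0.3))]

label, share = promote((0.5, 3.5), list(base), group)
print(label, share)
```

With N classes this needs N - 1 pairwise decisions in total, and a low vote share could be used as a signal to consult more deciders before accepting the result, which is one way to trade recognition time against recognition rate.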
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP18;TN912.34
Article ID: 1820716
Article link: http://sikaile.net/kejilunwen/wltx/1820716.html