Accent Identification Based on an Improved Extreme Learning Machine
Published: 2021-01-07 21:10
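When non-native speakers speak English, their speech shows distinct accents, that is, characteristics of a non-native English accent, and these characteristics can be used to identify both the speaker's accent and native language. Automatic foreign accent identification plays an important role in many speech systems, such as speaker identification, e-learning, telephone banking, voice mail, voice conversion and immigration screening, and it also matters for keeping automatic speech recognition (ASR) systems robust. Automatic accent identification nevertheless faces many difficulties. Chief among them, accent features are usually entangled with linguistic content, prosody, environmental noise and the speaker's own voice characteristics, so complex, nonlinear accent identification models are required. In addition, building an accent corpus with a large number of samples takes considerable time and effort. This thesis studies articulation from a linguistic perspective to obtain pronunciation features that effectively reflect the native language, adopts an improved extreme learning machine algorithm, uses a relatively authoritative and rich corpus of accented English, and implements both a binary and a multi-class accent identification model, obtaining good identification results. The thesis first studies how Arabic speakers differ in pronouncing English consonants and proposes an accent identification model based on the extreme learning machine (ELM). The Mel-frequency cepstral coefficients (MFCCs) of segmented consonant phonemes are used as the acoustic input features to train the ELM classifier. Under K-fold validation, the classifier shows faster learning and better performance: its accuracy reaches 88% with a standard deviation of 0.0167, whereas the SVM and DBN classifiers reach only 76% and 6...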
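The training scheme summarized in the abstract (fixed random hidden layer, output weights solved in closed form, accuracy reported as a mean and standard deviation over K folds) can be sketched as follows. This is a minimal illustration of a basic ELM, not the improved algorithm from the thesis. The class name `ELMClassifier`, the helper `kfold_accuracy`, the hidden-layer size, the regularization value and the synthetic 13-dimensional "MFCC-like" data are all assumptions made for the example.

```python
import numpy as np


class ELMClassifier:
    """Minimal single-hidden-layer extreme learning machine (illustrative only)."""

    def __init__(self, n_hidden=64, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # assumed hidden-layer size
        self.reg = reg            # assumed ridge regularization strength
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations from the fixed random projection.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Targets coded as +1 for the true class and -1 otherwise.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        # Output weights from ridge-regularized least squares.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def kfold_accuracy(X, y, k=5, seed=0):
    """Return the mean and standard deviation of accuracy over k folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = ELMClassifier(seed=seed).fit(X[train], y[train])
        scores.append(np.mean(model.predict(X[test]) == y[test]))
    return float(np.mean(scores)), float(np.std(scores))


# Synthetic stand-in for 13-dimensional MFCC vectors of two accent classes.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 13)),
               rng.normal(1.0, 1.0, size=(100, 13))])
y = np.array([0] * 100 + [1] * 100)

mean_acc, std_acc = kfold_accuracy(X, y)
print(f"mean accuracy = {mean_acc:.3f}, std = {std_acc:.3f}")
```

Because only the output weights are solved for, training reduces to one linear solve per fold, which is the usual reason ELMs train quickly compared with iteratively optimized classifiers. Real use would replace the synthetic arrays with MFCC features extracted from the segmented consonant phonemes.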
[Source]: Donghua University, Shanghai (Project 211 university, directly under the Ministry of Education)
[Pages]: 70
[Degree level]: Master's
[Table of contents]:
Dedication
Acknowledgement 1
Abstract (Chinese)
Abstract
Chapter 1 Introduction and Background
1.1 Introduction of Accent Identification (AID)
1.2 Work Background
1.2.1 Binary vs Multiple Accent Identification (AID)
1.2.2 Methods used in AID
1.2.3 Linguistic characteristics
1.2.4 Features Extraction level
1.3 Research Outline
1.4 Thesis Outline
Chapter 2 Literature Review
2.1 Linguistic basics
2.1.1 Language Transfer
2.1.2 Acoustic Analysis of Consonants
2.2 Features Extraction Technique
2.2.1 MFCC (Mel-frequency Cepstral Coefficients)
2.2.2 Prosodic Speech Features
2.3 Classification Techniques Overview
2.3.1 Support Vector Machine Model (SVM)
2.3.2 LSTM classifier Model
2.3.3 Extreme Learning Machine Model (ELM)
2.3.4 Kernel based Extreme Learning Machine Model
Chapter 3 The System Model
3.1 Introduction
3.2 AID Model Design
3.2.1 Speech Corpus
3.3 Pre-processing
3.4 Features Extraction
3.4.1 Mel-frequency Cepstral Coefficients (MFCCs)
3.4.2 Prosodic features
3.5 Features Reduction
3.6 Classification Phase
3.6.1 Classification algorithms
3.6.2 Training and Testing Phase
3.6.3 Evaluating a classifier
3.6.4 Performance index
3.7 Summary
Chapter 4 Consonant Phonemes based ELM Model for Foreign Accent Identification
4.1 Introduction
4.2 Consonant Phonemes based discriminative features
4.2.1 Bilabial stop /b/ vs /p/ pronunciation
4.2.2 Alveolar plosive /t/ vs /d/ pronunciation
4.2.3 Pronunciation of Velar plosive /k/ vs /g/
4.3 ELM Model Framework
4.4 Experimental setup
4.4.1 Algorithm Implementation and Classification
4.5 Comparative experiments
4.5.1 Time-consuming performance
4.5.2 Accuracies Comparison
4.6 Conclusion
4.7 Summary of the chapter
Chapter 5 MKELM based Multi-Classification Model for Foreign Accent Identification
5.1 Introduction
5.2 Model Design
5.3 Weighted scheme for Multi-classification
5.3.1 Weighted classification
5.3.2 Accent decision
5.4 Derivation of Multi-Kernel ELM
5.5 Experimental Setup
5.5.1 Software and hardware setup
5.5.2 Experimental procedure and Results
5.5.3 Comparative experiments
5.5.4 Time-consuming performance comparison
5.5.5 Comparison of accent classification results
5.6 Conclusion
Chapter 6 Conclusions and Future work
6.1 Conclusions
6.2 Future work
Acknowledgement 2
Bibliography
Article ID: 2963248
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2963248.html