

Research of Automatic Speech Recognition of the Asante-Twi D

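Published: 2021-12-17 07:14
  Automatic speech recognition (ASR) is the first and most important stage of a speech translation system, and the speech database is its most important resource. High-quality ASR, however, requires a very large speech database. The Asante-Twi dialect of the Akan language is considered extremely low-resource, so collecting speech data is a serious obstacle. This thesis proposes a new method for building an ASR system for a low-resource dialect from a small database, and reports good results. It first analyzes the characteristics of the dialect, then designs, collects, and organizes a representative Asante-Twi speech database, laying the groundwork for further speech recognition work. Because no prior work on Asante-Twi dialect recognition exists and there is no trusted baseline, the thesis uses the Kaldi toolkit to build three ASR systems with different features and methods, in order to select reliable algorithms and features for an Asante-Twi speech recognition system. To improve performance, cepstral mean and variance normalization (CMVN) and delta (Δ) dynamic features are applied to every feature extraction method used. In addition, the acoustic model units of each ASR system are improved with the GMM-HMM pattern classification algorithm, and two context-dependent (triphone) models are trained to provide better performance. The first ASR system uses MFCC feature extraction, the second uses MFCCs with context-dependent parameters, and the third uses PLP...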
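The two feature post-processing steps named in the abstract, CMVN and delta features, can be illustrated with a small sketch. This is not the thesis's code and not Kaldi's implementation. It is a plain-Python illustration that assumes per-utterance normalization, a regression window of 2 frames, and edge frames replicated at the utterance boundaries. The function names `cmvn`, `deltas`, and `add_deltas` and the toy feature values are hypothetical.

```python
from statistics import fmean, pstdev


def cmvn(frames):
    """Per-utterance cepstral mean and variance normalization (assumed scope).

    frames: list of feature vectors, one list of floats per frame.
    Each dimension is shifted to zero mean and scaled to unit variance.
    """
    dims = list(zip(*frames))                # transpose: one tuple per dimension
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) or 1.0 for d in dims]  # avoid dividing by zero on constant dims
    return [[(x - m) / s for x, m, s in zip(f, means, stds)] for f in frames]


def deltas(frames, window=2):
    """Delta features via the common regression formula.

    d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..window,
    with the first and last frames replicated past the utterance edges.
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            num = sum(
                n * (frames[min(t + n, last)][k] - frames[max(t - n, 0)][k])
                for n in range(1, window + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out


def add_deltas(frames, window=2):
    """Append delta and delta-delta coefficients to each static frame."""
    d1 = deltas(frames, window)
    d2 = deltas(d1, window)
    return [s + a + b for s, a, b in zip(frames, d1, d2)]


# Toy "utterance": 5 frames of 2-dimensional features (made-up numbers).
toy = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0], [5.0, 18.0]]
normalized = cmvn(toy)
features = add_deltas(normalized)
```

Each output frame holds the static coefficients followed by their delta and delta-delta values, so a 2-dimensional input frame becomes a 6-dimensional one. Real front ends typically start from 13 MFCC or PLP coefficients per frame.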

[Source]: Southwest University of Science and Technology, Sichuan Province

[Pages]: 72

[Degree]: Master's

[Table of Contents]:
Abstract (Chinese)
ABSTRACT
Main Symbol Table
1 Introduction
    1.1 Background and Significance of Study
    1.2 Problem Statement
    1.3 Akan Language and the Twi Dialect
    1.4 Related Work
    1.5 Goals of the Thesis
    1.6 Thesis Chapter Arrangement
2 Basics of Automatic Speech Recognition
    2.1 Mathematical Representation of an ASR System
    2.2 Basic Architecture of an ASR System
        2.2.1 Signal Processing / Feature Extraction
        2.2.2 Language Model
        2.2.3 Lexicon
        2.2.4 Acoustic Model
        2.2.5 Pattern Classification of Acoustic Vectors
        2.2.6 Decoding
    2.3 Metrics for Performance Measurement
    2.4 Summary of the Chapter
3 Approach to Asante-Twi ASR System Realization
    3.1 The Kaldi Toolkit Overview
    3.2 Asante-Twi Dialect Manual Data Preparation
        3.2.1 Audio Data
        3.2.2 Acoustic Data
        3.2.3 Language Data
    3.3 Asante-Twi Dialect Feature Extraction Processes
        3.3.1 Mel Frequency Cepstral Coefficients (MFCC)
        3.3.2 Perceptual Linear Prediction (PLP)
        3.3.3 Cepstral Mean and Variance Normalization (CMVN)
        3.3.4 Delta and Delta-Delta Features
    3.4 Asante-Twi Dialect Language Modeling
    3.5 Acoustic Modeling
        3.5.1 Gaussian Mixture Model (GMM)
        3.5.2 Hidden Markov Model (HMM)
        3.5.3 Generative Learning Approach: GMM-HMM Algorithm
    3.6 Asante-Twi Dialect ASR Systems Training
        3.6.1 Monophone Training
        3.6.2 First Triphone Training
        3.6.3 Second Triphone Training
    3.7 Asante-Twi Dialect ASR Systems Testing
        3.7.1 Monophone Testing
        3.7.2 First Triphone Testing
        3.7.3 Second Triphone Testing
    3.8 Summary of the Chapter
4 Results and Discussion of Asante-Twi ASR Systems
    4.1 Performance Measurement Metrics for Asante-Twi ASR Systems
        4.1.1 Word Error Rate (WER)
        4.1.2 Sentence Error Rate (SER)
    4.2 Analysis of Results of Decoding
        4.2.1 First Asante-Twi Dialect ASR System Using MFCCs and Δ (2000 Leaves, 11000 Gaussians) and Δ-Δ (2500 Leaves, 15000 Gaussians) Transformations
        4.2.2 Second Asante-Twi Dialect ASR System Using MFCCs and Δ (2000 Leaves, 10000 Gaussians) and Δ-Δ (2500 Leaves, 15000 Gaussians) Transformations
        4.2.3 Third Asante-Twi Dialect ASR System Using PLPs and Δ (2000 Leaves, 10000 Gaussians) + Δ-Δ (2500 Leaves, 15000 Gaussians) Transformations
        4.2.4 Comparison of the Best Performances of All Three Asante-Twi Dialect ASR Systems
    4.3 Summary of the Chapter
5 Conclusion
    5.1 Overall Summary
    5.2 Limitations, Future Works and Beyond
Acknowledgement
References



Article ID: 3539633

Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/3539633.html

