Design and Study of a VC-Based Advertisement Speech Recognition System

Published: 2018-03-01 15:36

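Keywords: speech recognition; feature extraction; linear prediction cepstral coefficients; Mel-frequency cepstral coefficients; K-means; dynamic time warping. Source: Nanjing University of Science and Technology, master's thesis, 2007. Document type: degree thesis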

[Abstract]: As the economy develops, television advertising has become an increasingly important part of social life, and the social problems it brings have become increasingly apparent. False advertising in particular seriously misleads consumers and harms the public, so advertisement monitoring has become a problem that society urgently needs to address. This project studies a software implementation of advertisement speech recognition. Starting from the basic principles and process of speech recognition, the thesis introduces the principles and calculation methods of speech endpoint detection, speech feature extraction, speech modeling, and model matching. For advertisement speech endpoint detection, it introduces short-time energy and the short-time zero-crossing rate and, drawing on simulation results, presents a double-threshold endpoint detection method suited to this system. For feature extraction, it introduces the speech cepstrum and two commonly used features, linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). For speech modeling and matching, it applies K-means clustering, vector quantization, and the dynamic time warping (DTW) algorithm to deal with the excessive volume of feature-parameter data and with the extra and dropped frames that occur in advertisement audio. These algorithms were simulated in MATLAB, their characteristics and parameter-selection methods were compared and analyzed, and a modeling and recognition method suited to the project is given. On this basis, test experiments on specified advertisement speech were carried out in the VC environment. The experiments show that the system has some practical value for advertisement monitoring.
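The abstract motivates DTW by the extra and dropped frames found in advertisement audio. The idea can be illustrated with a minimal sketch, which is not the thesis's implementation: the function name `dtw_distance`, the Euclidean frame distance, and the basic three-way step pattern are all assumptions made for illustration, since the abstract does not state which DTW variant or path constraints were used.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Illustrative sketch only: Euclidean frame distance and the basic
    (insert / delete / match) step pattern are assumptions, not details
    taken from the thesis.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    def frame_dist(x, y):
        # Euclidean distance between two feature vectors of equal length.
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy one-dimensional "feature" sequences (hypothetical data).
template = [(0.0,), (1.0,), (2.0,), (3.0,)]
extra_frame = [(0.0,), (1.0,), (1.0,), (2.0,), (3.0,)]   # one frame repeated
dropped_frame = [(0.0,), (2.0,), (3.0,)]                 # one frame missing
unrelated = [(5.0,), (5.0,), (5.0,), (5.0,)]

print(dtw_distance(template, extra_frame))    # 0.0
print(dtw_distance(template, dropped_frame))  # 1.0
print(dtw_distance(template, unrelated))      # 14.0
```

In this toy example a repeated frame costs nothing and a dropped frame costs little, while an unrelated sequence stays far away, which is the property that makes DTW attractive when the audio under test may have gained or lost frames relative to the stored template.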
[Degree-granting institution]: Nanjing University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2007
[Classification number]: TN912.34

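[Citing literature]

Related master's theses (first 2)

1 Kang Ming; Research on Speech Recognition and Full-Text Retrieval Based on Vector Quantization [D]; Chongqing University; 2009

2 Zhang Tao; Design and Implementation of an FPGA-Based Speech Denoising System Using the Wavelet Lifting Algorithm [D]; Guangxi Normal University; 2012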

Article ID: 1552569


Article link: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/1552569.html
