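Research and Implementation of Speech Recognition in Video Conferencing
Published: 2018-03-01 18:09
Keywords: video conferencing; speech recognition; Android platform. Source: South China University of Technology, 2014 master's thesis. Document type: degree thesis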
[Abstract]: Video conferencing, a means of remote, real-time information exchange and interaction, is widely used in healthcare, education, finance, government, and other fields. Traditional video conferencing systems are operated mainly by manual control. As technology advances and user-experience expectations rise, applying speech recognition to video conferencing systems has practical value. Speech recognition is the process by which a computer recognizes and interprets a human speech signal and converts it into the corresponding text or command. It is gradually becoming a key technology for human-machine interfaces in information technology, and its applications have grown into a competitive, emerging high-tech industry.
This thesis applies speech recognition to a video conferencing system: preset voice commands are recognized and used to operate and control the conference, so that voice control replaces manual control through a mouse, keyboard, mobile smart terminal, or similar device, making the system more user-friendly and intelligent.
The work builds on the CoolView video conferencing system and its remote-control application on the Android platform. It designs the overall architecture of a speech recognition system for the remote control and divides it into functional modules. To suit the remote control's usage scenarios, two recognizers are implemented: an online speech recognition system based on Google speech recognition technology, used for selecting a conference, and a local, small-vocabulary speech recognition system based on the CMU PocketSphinx engine, used by the remote control to control its controlled terminal. In addition, to reduce the influence of ambient noise and improve speech signal quality, an audio processing module is designed and implemented for noise suppression, lossless audio compression, and related processing. Finally, testing shows that the implemented system meets the basic operating requirements of the video conferencing system, which verifies the feasibility of applying speech recognition to video conferencing. The local small-vocabulary system also achieves a relatively high recognition rate and a relatively short recognition time, which greatly improves the user experience.
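[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34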
Article ID: 1552999
Article link: http://sikaile.net/kejilunwen/wltx/1552999.html