
Design and Implementation of a Client SDK for a Voice Cloud Open Platform on iOS

Published: 2018-06-12 04:45

Topics: voice cloud + speech recognition. Source: Beijing University of Posts and Telecommunications, 2014 master's thesis

[Abstract]: With smartphones, tablets, and other mobile terminals now ubiquitous and the mobile Internet growing rapidly, mobile applications place ever higher demands on text input. Navigation and chat applications in particular want to use speech recognition to free users' hands for text entry. As Siri has matured on iOS devices, the major Internet companies have released their own speech recognition systems. At present, however, iOS does not give developers a public Siri API for invoking speech recognition, and the major Internet companies place strict restrictions on their client-side speech recognition SDKs, so iOS lacks a general-purpose, open speech recognition SDK for developers to use.
This thesis surveys the open speech recognition SDKs currently available on iOS, compares their product features, analyzes what developers need from a speech recognition SDK, and proposes a new, complete solution for implementing a client-side speech recognition SDK: the Voice Cloud Open Platform Client SDK, or Voice Cloud SDK for short. The Voice Cloud SDK lets developers easily build full-featured, highly interactive speech recognition applications on iOS devices. Throughout development and use, developers get speech recognition service without having to maintain a speech engine.
Guided by software engineering principles, the thesis implements the Voice Cloud SDK step by step, following the software development process. First, based on an understanding of the basic flow on the speech recognition server and on users' habits when using speech recognition, it states the requirements for the Voice Cloud Open Platform Client SDK. The requirements analysis lists the functions the Voice Cloud SDK provides to users and the functions needed for the SDK to interact with the server. The detailed design that follows divides the SDK by function into several main modules: a recording module, a valid-sound detection module, an audio compression and encoding module, a network send/receive module, and a recognition-result callback module, among others. The design enumerates the parameters and methods of each module and uses diagrams to explain the workflow and the interactions among the modules. The code is then implemented according to the design, module by module, in the order in which audio data flows through them. Finally, the whole Voice Cloud SDK undergoes systematic software testing, which further improves its usability and security.
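The abstract names a valid-sound detection module between recording and audio compression but does not say how it works. As a rough illustration only, the sketch below assumes a simple short-time-energy threshold over fixed-length frames; this is a stand-in technique, not the thesis's method. All names (`frame_energy`, `detect_speech_frames`, `trim_to_speech`), the frame length, and the threshold value are hypothetical, and Python is used only for brevity, since the SDK itself targets iOS.

```python
from typing import List, Optional, Tuple


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def split_frames(samples: List[int], frame_len: int) -> List[List[int]]:
    """Cut the sample stream into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


def detect_speech_frames(
    samples: List[int],
    frame_len: int = 160,       # assumed: 10 ms at 16 kHz
    threshold: float = 1.0e6,   # assumed energy threshold, tuned per device
) -> Optional[Tuple[int, int]]:
    """Return (first, last) indices of frames whose energy exceeds the threshold.

    Returns None when no frame is loud enough to count as valid sound.
    """
    frames = split_frames(samples, frame_len)
    active = [i for i, f in enumerate(frames) if frame_energy(f) > threshold]
    if not active:
        return None
    return active[0], active[-1]


def trim_to_speech(
    samples: List[int], frame_len: int = 160, threshold: float = 1.0e6
) -> List[int]:
    """Keep only the samples between the first and last active frame."""
    span = detect_speech_frames(samples, frame_len, threshold)
    if span is None:
        return []
    first, last = span
    return samples[first * frame_len:(last + 1) * frame_len]
```

In a pipeline shaped like the one the abstract outlines, only the trimmed samples would be passed on to compression and upload. A production detector would likely add smoothing and a few trailing "hangover" frames so that quiet word endings are not cut off.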
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34


Article ID: 2008426


Article link: http://sikaile.net/kejilunwen/wltx/2008426.html
