Image Text Recognition and Speech Playback System Based on the Android Platform

Published: 2018-03-17 00:01

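Topic: Android platform; Focus: text recognition; Source: Nanjing University of Posts and Telecommunications, 2017 master's thesis; Type: degree thesis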

[Abstract]: According to statistics, more than about 1.5% of the world's population cannot study and live as sighted people do because of visual impairment, and image text recognition combined with speech playback can give them some help with reading. Similar products for Android devices are already on the market, such as Yunmai Document Recognition (云脈文檔識別) and OCR (Optical Character Recognition) Text Recognition, but they place high demands on the photograph: the text must be sharp, the image must not be skewed, and the image must contain nothing but text. Otherwise recognition fails or its accuracy drops, and these requirements are unrealistic for visually impaired users. This thesis therefore develops Android-based text image recognition software and adds a speech playback function so that users can obtain the text by listening. The main work is as follows. First, skew correction and text region cropping algorithms for text images are proposed; grayscale conversion, binarization, skew correction, and text region cropping together reduce the redundant information in the image to be recognized and make up the preprocessing stage. Second, a text recognition function is developed on the Tesseract recognition engine as optimized by Google, and recognition accuracy is improved by training and extending the character library. Finally, a speech playback function is developed on the Shoushuo TTS (手說TTS, Text To Speech) engine; it reads the recognized text aloud and supports different voice genders, volumes, and speaking rates. System tests verify the effectiveness of the Android-based image text recognition and speech playback system, and a comparison with Yunmai Document Recognition, one of the most widely used recognition applications on the market, shows that the system performs better on text images that are skewed or contain non-text regions.
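The abstract names the preprocessing steps but does not give the algorithms themselves. As a rough illustration only, the sketch below chains three of those steps (grayscale conversion, binarization, text region cropping) in plain Python. It assumes Otsu's method for the threshold and a simple bounding box for the crop, neither of which is confirmed as the thesis's approach, and it leaves out skew correction entirely. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def to_gray(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Convert rows of (r, g, b) pixels to 8-bit luminance (BT.601 weights)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in rgb]


def otsu_threshold(gray: List[List[int]]) -> int:
    """Return the threshold t maximizing between-class variance (Otsu).

    Pixels with value <= t form the dark class.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of intensities in the dark class so far
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray: List[List[int]], t: int) -> List[List[int]]:
    """Mark dark pixels (assumed to be ink) as 1 and background as 0."""
    return [[1 if v <= t else 0 for v in row] for row in gray]


def crop_text_region(binary: List[List[int]]) -> Optional[List[List[int]]]:
    """Crop to the bounding box of all ink pixels; None if there is no ink."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in binary[top:bottom + 1]]


def preprocess(rgb: List[List[Pixel]]) -> Optional[List[List[int]]]:
    """Grayscale -> Otsu binarization -> bounding-box crop (no deskew)."""
    gray = to_gray(rgb)
    return crop_text_region(binarize(gray, otsu_threshold(gray)))
```

A bounding-box crop only removes uniform margins. An image with non-text regions, which the abstract says the system handles, would need a more selective step, for example connected-component or projection-profile analysis.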
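[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[CLC number]: TP391.41; TN912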



Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/1622235.html
