Research on Localization and Segmentation of Unconstrained Text in Images

Published: 2018-06-19 21:05

Topic: iFAST detection algorithm + stroke-connectivity segmentation; Source: Guangxi Normal University, master's thesis, 2017

[Abstract]: Text recognition in static images and video frames proceeds in two stages. The first, text detection, locates and extracts text by segmenting text regions from the input image. The second, text recognition, converts each detected text-region image into the corresponding text. Text detection and localization determine where text appears in the image and find its bounding boxes, and they are the most critical step in the whole pipeline. Text segmentation removes as much of the background around the text as possible to ease subsequent recognition. Because computer vision aims to process, analyze, and understand images, text detection and localization are an indispensable basic step and key stage, which is the motivation for this work.

The literature shows that recognition algorithms designed for traditional, constrained images are hard to apply directly to natural scene images: characters there differ in size, orientation, font, degree of blur, illumination, and degree of occlusion, and real-time requirements are comparatively strict. All text is composed of strokes, and the key to stroke detection is detecting the corner points on strokes. Among the commonly used corner detection algorithms (SURF, AGAST, BRISK, FAST, SIFT, ORB), FAST (Features from Accelerated Segment Test) is not scale-invariant but has a degree of rotation and affine invariance and, more importantly, is markedly faster, which suits real-time applications. Building on FAST and the stroke width transform, this thesis therefore proposes iFAST (improved FAST), a fast text corner detection algorithm for locating and segmenting image regions that contain unconstrained text.

iFAST first detects the corner points of strokes in the image, then groups them into text fragments according to corner attributes, and then uses a cascade classifier trained with a multi-scale adaptive pyramid model to remove spurious non-text regions. The algorithm detects and segments text regions of different sizes quickly, robustly, and accurately. An efficient text clustering algorithm based on text-direction voting then groups the detected regions into text lines so that later stages (for example, an OCR module) can process them.

ICDAR2013 and MSRA-TD500, two datasets commonly used in the text recognition field, serve as training and test sets, and performance is compared with other algorithms. The results show that iFAST performs well on diverse and multi-oriented text. Compared with the commonly used MSER text detection algorithm, iFAST produces half as many regions, detects 25% more characters, and has 4 times higher detection speed. With the subsequent classification stage, iFAST reduces the number of segmented regions to 1/7 of the original and is nearly 3 times faster than MSER.
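For readers unfamiliar with the segment test that FAST is named after, the following is a minimal sketch of the standard FAST criterion, not the thesis's iFAST variant, whose details the abstract does not give. A pixel counts as a corner when at least `n` contiguous pixels on a 16-pixel circle of radius 3 are all brighter or all darker than the center by more than a threshold. The function names, the default `threshold=20`, and the choice `n=9` are assumptions made for illustration.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3 used by standard FAST,
# listed in order around the circle so that neighbours in the list are
# neighbours on the ring.
CIRCLE = [
    (0, -3), (1, -3), (2, -2), (3, -1),
    (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1),
    (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]


def is_fast_corner(img, x, y, threshold=20, n=9):
    """Return True if pixel (x, y) passes the FAST segment test.

    img is a list of rows of grayscale intensities; (x, y) must lie at
    least 3 pixels away from every image border.
    """
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # 1: look for a brighter arc, -1: a darker arc
        flags = [sign * (p - center) > threshold for p in ring]
        run = 0
        # Walk the ring twice so arcs that wrap past index 0 are counted.
        for flag in flags + flags:
            run = run + 1 if flag else 0
            if run >= n:
                return True
    return False


def detect_corners(img, threshold=20, n=9):
    """Run the segment test on every pixel that has a full ring around it."""
    height, width = len(img), len(img[0])
    return [
        (x, y)
        for y in range(3, height - 3)
        for x in range(3, width - 3)
        if is_fast_corner(img, x, y, threshold, n)
    ]
```

On a synthetic image whose lower-right quadrant is bright, the quadrant's corner pixel passes the test, while a pixel on a straight edge or in a flat area does not. This selectivity is what makes corner points a plausible cue for stroke endpoints and junctions.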
[Degree-granting institution]: Guangxi Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41


Article ID: 2041264


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2041264.html
