Research on Text Localization and Character Recognition Methods for Scene Images
[Abstract]: Text in scene images carries rich and precise information and is in wide demand in fields such as industrial automation, traffic management, automatic translation, and assistive services for people with disabilities. However, because of non-uniform illumination, background texture, and the diversity of text itself, the accuracy of scene text extraction remains low, and accurately extracting text information from scene images has therefore become a research focus in pattern recognition. This work has practical value for improving the accuracy and robustness of scene-image text recognition systems. The main work and contributions of this thesis are as follows.

First, exploiting the consistency of character gray values in text regions, the convex distribution of gradient magnitude along the x direction, and the nearest-neighbor relationship between characters, a scene-image text localization method based on convolutional neural network (CNN) features and support vector machine (SVM) output scores is proposed. Typical points inside text regions are detected from the convex distribution of the x-direction gradient magnitude and the consistency of character gray values; candidate connected components are extracted by gray-level clustering around these typical points, and additional candidate connected components are extracted from the remaining regions by k-means clustering. A CNN-based text connected-component classifier then extracts texture features from each connected component, non-text components are suppressed according to the SVM output score, and nearest-neighbor connected components are merged into candidate text regions. Finally, each candidate region is verified by an SVM using its histogram of oriented gradients (HOG) feature. On the ICDAR2011 and ICDAR2013 scene text datasets, the localization method achieves F-measures of 76% and 78%, respectively, showing that it effectively suppresses interference from complex background textures.

Second, exploiting the color similarity of characters within a text line, a character segmentation method for text regions based on color clustering and gradient vector flow is proposed. K-means clustering over pixel color and spatial position yields k candidate layers, and the layers containing candidate characters are selected using geometric features of the connected components such as fill (duty) ratio and aspect ratio. Within homogeneous regions, pixels far from edges are taken as candidate segmentation points, and the cutting path with the lowest cumulative cost is found using the squared gray-level difference as the step cost. On the ICDAR2013 scene-image text dataset, this method achieves an F-measure of 87.9%; the experiments show that color clustering effectively suppresses interference from non-uniform illumination and occlusion.
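To make the cutting-path search concrete, the following is a minimal dynamic-programming sketch, not the implementation used in the thesis: it assumes the gap between two adjacent characters has been cropped into a small grayscale band, uses the squared gray-level difference between pixels in adjacent rows as the step cost, and allows the path to shift by at most one column per row; the function name and these constraints are illustrative assumptions.

```python
import numpy as np

def min_cost_cut_path(band: np.ndarray) -> np.ndarray:
    """Top-to-bottom cutting path of minimum cumulative cost (illustrative sketch).

    band is a 2-D grayscale array (float) covering the gap between two
    candidate characters.  The cost of stepping from pixel (r-1, p) to
    (r, c) is the squared gray-level difference, and the path may shift
    by at most one column per row.  Returns the column index of the
    path in every row.
    """
    rows, cols = band.shape
    cost = np.full((rows, cols), np.inf)
    back = np.zeros((rows, cols), dtype=int)
    cost[0] = 0.0                          # any starting column is allowed

    for r in range(1, rows):
        for c in range(cols):
            for dc in (-1, 0, 1):          # stay, or shift one column
                p = c + dc
                if 0 <= p < cols:
                    step = (band[r, c] - band[r - 1, p]) ** 2
                    if cost[r - 1, p] + step < cost[r, c]:
                        cost[r, c] = cost[r - 1, p] + step
                        back[r, c] = p

    path = np.empty(rows, dtype=int)       # backtrack the cheapest path
    path[-1] = int(np.argmin(cost[-1]))
    for r in range(rows - 1, 0, -1):
        path[r - 1] = back[r, path[r]]
    return path
```

The accumulate-and-backtrack structure is independent of how candidate segmentation pixels are chosen; restricting the start of the path to points far from edges in homogeneous regions, as described above, would only change the initialization of the first row.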
Finally, exploiting the rotation invariance of character structure, a multi-direction single-character recognition model is proposed. A deformed HOG operator and concentric circular template sampling are used to extract local joint HOG texture features together with quadrant structure features between the sampling points, and the two are combined into the character feature. A bag-of-words model over a learned feature dictionary is then built, and characters are recognized with an SVM. Character recognition experiments on the ICDAR character dataset, the Chars74K dataset, and a manually collected dataset yield accuracies of 82%, 87%, and 73%, respectively, showing that the model is robust to rotation.
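The sketch below outlines how concentric circular sampling can be combined with a bag-of-words model and an SVM. It is an illustration under stated assumptions, not the thesis implementation: character images are assumed to be normalized to 64x64 grayscale, the radii, patch size, and dictionary size are arbitrary choices, plain HOG patch descriptors from `skimage.feature.hog` stand in for the deformed HOG operator, and the quadrant structure features between sampling points are omitted.

```python
import numpy as np
from skimage.feature import hog
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

# Assumed settings: 64x64 grayscale character images, three concentric
# circles, 12 sampling points per circle, a 64-word visual dictionary.
PATCH, RADII, N_ANGLES, DICT_SIZE = 16, (8, 14, 20), 12, 64

def circle_points(center, radius, n=N_ANGLES):
    """Points evenly spaced on one circle of the concentric template."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([center[0] + radius * np.cos(t),
                     center[1] + radius * np.sin(t)], axis=1)

def local_descriptors(img):
    """HOG descriptor of a small patch around every sampling point."""
    h, w = img.shape
    descs = []
    for r in RADII:
        for y, x in circle_points(((h - 1) / 2.0, (w - 1) / 2.0), r):
            y0, x0 = int(round(y)) - PATCH // 2, int(round(x)) - PATCH // 2
            if 0 <= y0 and 0 <= x0 and y0 + PATCH <= h and x0 + PATCH <= w:
                patch = img[y0:y0 + PATCH, x0:x0 + PATCH]
                descs.append(hog(patch, orientations=9,
                                 pixels_per_cell=(8, 8),
                                 cells_per_block=(1, 1)))
    return np.array(descs)

def train(char_images, labels):
    """Learn a visual dictionary, encode bag-of-words histograms, fit an SVM."""
    all_descs = np.vstack([local_descriptors(im) for im in char_images])
    dictionary = KMeans(n_clusters=DICT_SIZE, n_init=10).fit(all_descs)
    hists = [np.bincount(dictionary.predict(local_descriptors(im)),
                         minlength=DICT_SIZE) for im in char_images]
    clf = LinearSVC().fit(np.array(hists, dtype=float), labels)
    return dictionary, clf
```

At test time, an unseen character image would be encoded against the same dictionary and classified with `clf.predict`.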
【Degree-granting institution】: Huazhong University of Science and Technology
【Degree level】: Master's
【Year of award】: 2016
【Classification number】: TP391.41