Research on Text Localization and Extraction Algorithms for Natural Scene Images

Published: 2018-11-24 19:28
[Abstract]: In recent years, with the rapid development of Internet and information technology and the spread of portable devices such as mobile phones and digital cameras, people can capture images anywhere at any time and upload them to the network. Text, as a medium of communication between people, is an important carrier of information. Extracting text from natural scene images nevertheless remains a hard problem. First, text is a human-designed structure, and different languages exhibit different structural characteristics; the scripts of East Asian countries such as China, Japan, and Korea, for example, have large character sets, complex character structures, and varied glyph shapes. A single simple method that detects text in all languages is therefore still out of reach. Second, image acquisition is inevitably affected by factors such as uneven illumination and complex background patterns, which make text detection difficult. For these reasons, text localization and recognition in natural scene images remains an active research topic. Text localization is a key step in extracting text information from images, and its output directly affects the subsequent text recognition (OCR) stage.

This thesis targets horizontal English text and designs a multi-resolution framework for text localization in natural scene images that localizes text from coarse to fine and outputs text region images. In the coarse localization stage, the framework converts each image to three scales so that characters of different sizes can be detected. A trained convolutional neural network then classifies the extracted candidate regions. Two methods are used to obtain candidate regions: one based on maximally stable extremal regions (MSER) and one based on the stroke width transform (SWT). Experiments show that, because the convolutional neural network detects likely character regions effectively, the performance of the coarse stage depends mainly on the set of extracted candidate regions. In the experiments reported here, the MSER-based method yields a more complete candidate set than the SWT-based method.

In the fine extraction stage, a set of rules based on gray-level co-occurrence matrix features and contrast saliency features is first designed to fuse the coarse localization results obtained at the different resolutions. To remove false-positive text regions, the fused results are then passed to an adaptive boosting (AdaBoost) classifier, trained with histogram of oriented gradients (HOG) features as the descriptor, which yields the final text lines. Experiments show that this stage effectively improves the accuracy of text localization. The output of the framework can be further segmented with an image binarization method and then passed directly to an OCR program for recognition.
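The gray-level co-occurrence matrix named among the fusion features is a standard texture descriptor, and a small sketch may make it concrete. The abstract does not state the thesis's actual fusion rules, pixel offsets, or gray-level quantization, so everything specific below is an assumption made for illustration: the function names `glcm` and `glcm_contrast`, the horizontal offset of one pixel, and the four gray levels. The code only shows how a co-occurrence matrix and its contrast statistic are computed in general.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix for the offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    Entry [i][j] is the fraction of pixel pairs whose first pixel has
    level i and whose neighbor at offset (dx, dy) has level j.
    """
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                a, b = image[y][x], image[ny][nx]
                if not (0 <= a < levels and 0 <= b < levels):
                    raise ValueError("gray level outside [0, levels)")
                counts[a][b] += 1
                total += 1
    if total == 0:
        raise ValueError("no pixel pairs for this offset")
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """Contrast statistic: sum over i, j of (i - j)^2 * p[i][j]."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


# Hypothetical 2x2 patches: a flat one and one with a sharp vertical edge.
flat_patch = [[2, 2], [2, 2]]
edge_patch = [[0, 3], [0, 3]]
flat_contrast = glcm_contrast(glcm(flat_patch))
edge_contrast = glcm_contrast(glcm(edge_patch))
```

A flat patch gives a contrast of zero, while a patch with strong horizontal transitions gives a large value, which is why statistics of this kind are commonly used to tell stroke-like texture from smooth background.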
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.41
