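Research on Machine-Learning-Based Text Detection and Multi-Script Identification in Natural Images
Topic: text detection + script identification; Reference: Yanbian University, master's thesis, 2017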
[Abstract]: Writing is an essential symbolic tool for conveying human thought and emotion and for transmitting culture, and its importance and irreplaceability are evident in every aspect of social production and daily life. In modern urban environments text is ubiquitous: posters, road signs, plaques, and light boxes all carry large amounts of textual information. In natural images, the semantic information conveyed by text is an important cue for understanding image content, and script identification in natural images is an important research direction in content-based image retrieval and multilingual system development. Detecting text in natural scene images and identifying its script is difficult. Text in different scenes varies in color, quantity, size, and spacing, and the background is often complex, with further problems such as noise, skew, and perspective distortion. How to process natural images containing text in multiple languages effectively has therefore become a pressing problem in natural scene analysis and understanding.
This thesis proposes a text region detection method based on visual saliency and edge density, and a script identification method based on image features and machine learning. First, the text region detection method finds visually salient regions with a multi-scale spectral residual approach, applies the Sobel operator within those regions to detect edges, computes the edge density of the image, preprocesses the edges with mathematical morphology, and finally locates text regions using prior knowledge of how text is arranged in natural images. Second, the script identification method constructs sample character images for Arabic numerals, English, Russian, Japanese kana, Simplified Chinese, and Korean, and extracts their skeletons. Basic image features of the skeletons are used to build a feature set for each script. Based on the structural characteristics of the scripts and the properties of the classifiers, identification is split into two stages: coarse classification and fine classification. In the coarse stage, a support vector machine (SVM) divides characters into two groups: the first contains Arabic numerals, English, Russian, and Japanese kana, and the second contains Simplified Chinese and Korean. In the fine stage, an SVM identifies the script of characters in the first group, and a BP (backpropagation) neural network identifies the script of characters in the second group.
Experimental results show that the text detection method based on visual saliency and edge density achieves a detection rate of 73%, and the script identification method based on basic image features and machine learning achieves an identification rate of 73.33%. According to the thesis, these results address text detection and script identification in natural images and support the correctness and feasibility of the proposed methods.
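The edge-density step of the detection stage can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a grayscale image given as a list of rows, a fixed gradient threshold, and non-overlapping square blocks, none of which the abstract specifies. The names `sobel_edges` and `edge_density` are hypothetical, and the saliency and morphology steps are omitted.

```python
from typing import List

Image = List[List[float]]

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(gray: Image, threshold: float) -> List[List[int]]:
    """Binary edge map: 1 where the Sobel gradient magnitude exceeds threshold.

    The threshold is an assumed parameter, not a value from the thesis.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    # Border pixels stay 0 because the 3x3 kernel does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1
    return edges


def edge_density(edges: List[List[int]], win: int) -> List[List[float]]:
    """Fraction of edge pixels in each non-overlapping win x win block.

    Non-overlapping blocks are an assumption made for brevity.
    """
    h, w = len(edges), len(edges[0])
    density = []
    for by in range(0, h - win + 1, win):
        row = []
        for bx in range(0, w - win + 1, win):
            count = sum(
                edges[y][x]
                for y in range(by, by + win)
                for x in range(bx, bx + win)
            )
            row.append(count / (win * win))
        density.append(row)
    return density
```

Blocks that contain many stroke-like intensity transitions receive a higher density than flat background blocks, which is the property a text detector of this kind relies on when selecting candidate regions.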
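The two-stage script identification scheme can be sketched as a simple routing function. This is a minimal illustration under stated assumptions: the classifiers are passed in as plain callables, whereas the thesis uses an SVM for the coarse stage, an SVM for the first group, and a BP neural network for the second. The function name `identify_script`, the toy stand-in classifiers, and the two-element feature vector are all hypothetical.

```python
from typing import Callable, Sequence

Features = Sequence[float]

# Script groups as listed in the abstract.
GROUP_1 = ("Arabic numerals", "English", "Russian", "Japanese kana")
GROUP_2 = ("Simplified Chinese", "Korean")


def identify_script(
    features: Features,
    coarse: Callable[[Features], int],
    fine_group_1: Callable[[Features], str],
    fine_group_2: Callable[[Features], str],
) -> str:
    """Route a feature vector through the coarse stage, then the matching fine stage."""
    group = coarse(features)
    if group == 1:
        label, allowed = fine_group_1(features), GROUP_1
    elif group == 2:
        label, allowed = fine_group_2(features), GROUP_2
    else:
        raise ValueError(f"coarse stage returned unknown group {group!r}")
    if label not in allowed:
        raise ValueError(f"label {label!r} is not in group {group}")
    return label


# Toy stand-ins for trained models (hypothetical decision rules, for illustration only).
def toy_coarse(f: Features) -> int:
    return 1 if f[0] < 0.5 else 2


def toy_fine_1(f: Features) -> str:
    return GROUP_1[min(int(f[1] * 4), 3)]


def toy_fine_2(f: Features) -> str:
    return GROUP_2[0] if f[1] < 0.5 else GROUP_2[1]


if __name__ == "__main__":
    print(identify_script([0.2, 0.3], toy_coarse, toy_fine_1, toy_fine_2))
    print(identify_script([0.8, 0.7], toy_coarse, toy_fine_1, toy_fine_2))
```

Keeping the fine classifiers separate means each one only has to discriminate among scripts with similar structure, which is the motivation the abstract gives for splitting the task.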
[Degree-granting institution]: Yanbian University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41