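Handwritten Recognition of Single Tibetan Characters and Its Applications
Topic: handwritten recognition of single Tibetan characters. Focus: feature extraction. Source: Xidian University, master's thesis, 2015. Document type: degree thesis.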
[Abstract]: China is a vast and populous country made up of 56 ethnic groups, and the Tibetan people are one of its major ethnic minorities. Tibetan literature has accumulated over a long period; apart from Chinese, it is the oldest and most richly documented linguistic heritage in China. Recognition technology for English and Chinese is already mature and widely used in many fields. Tibetan recognition, by contrast, started late and requires researchers who are familiar with the language, so the technology remains immature and results are relatively few. Human-computer interaction in Tibetan still relies on keyboard encoding, which is the only input method available and is slow and inefficient, so it does not meet users' needs. Compared with keyboard encoding, handwriting input is much closer to the way people naturally write. With the spread of mobile devices, handwriting input has become an important mode of human-computer interaction, so Tibetan handwriting recognition has both social significance and broad market prospects. This thesis collects and builds a database of 1,000 sets of handwritten Tibetan characters, describes the characteristics of Tibetan characters, and studies the recognition of handwritten single Tibetan characters in detail. The specific work is as follows:
1. The background, current state, and significance of research on Tibetan recognition are introduced.
2. The characteristics of Tibetan characters are classified and described systematically, the characteristics of handwritten Tibetan are analyzed, the difficulties of handwritten Tibetan character recognition are explained, and the character recognition methods commonly used in current research are reviewed.
3. The preprocessing steps for handwritten Tibetan characters are described in detail. Preprocessing preserves as much of the original information in the handwritten character as possible while filtering out redundant information, which makes feature extraction and recognition easier.
4. Four features of Tibetan characters are introduced: the directional line element feature, the stroke segment structure feature, the gradient feature, and the Gabor feature. The directional line element feature and the stroke segment structure feature are online features; the gradient feature and the Gabor feature are offline features. Experiments compare the performance of the online and offline features to identify the better-performing ones. Two fusion methods are also used to combine the two recognition approaches, and experiments show that the resulting system achieves a good recognition rate.
5. Handwriting data collection software for the Android platform and a Tibetan input application that combines handwriting with key input are presented.
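The distinction between online and offline features, and the idea of fusing two recognizers, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the thesis's formulas, grid sizes, or fusion rules. `stroke_direction_feature` is a simplified direction histogram over pen trajectories (a stand-in for an online feature, not the thesis's directional line element feature), `gradient_direction_feature` is a generic grid-based gradient histogram over a character image (an offline feature), and `fuse_scores` is a plain weighted-sum score fusion. All function names and parameters are hypothetical.

```python
import numpy as np


def stroke_direction_feature(strokes, bins=8):
    """Online feature (illustrative): histogram of pen-movement directions,
    weighted by segment length. `strokes` is a list of (x, y) point lists."""
    hist = np.zeros(bins)
    for stroke in strokes:
        pts = np.asarray(stroke, dtype=float)
        if len(pts) < 2:
            continue  # a single point has no direction
        d = np.diff(pts, axis=0)                      # segment vectors
        length = np.hypot(d[:, 0], d[:, 1])
        ang = np.arctan2(d[:, 1], d[:, 0]) % (2 * np.pi)
        idx = np.floor(ang / (2 * np.pi) * bins).astype(int) % bins
        np.add.at(hist, idx, length)                  # accumulate per direction bin
    total = hist.sum()
    return hist / total if total > 0 else hist


def gradient_direction_feature(img, grid=4, bins=8):
    """Offline feature (illustrative): per-cell histogram of gradient
    directions, weighted by gradient magnitude, L2-normalized."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)                         # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    bin_idx = np.floor(ang / (2 * np.pi) * bins).astype(int) % bins
    h, w = img.shape
    rows = np.arange(h) * grid // h                   # grid row of each pixel row
    cols = np.arange(w) * grid // w                   # grid column of each pixel column
    r_idx, c_idx = np.meshgrid(rows, cols, indexing="ij")
    feat = np.zeros((grid, grid, bins))
    np.add.at(feat, (r_idx, c_idx, bin_idx), mag)
    feat = feat.ravel()
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat


def fuse_scores(online_scores, offline_scores, weight=0.5):
    """Weighted-sum fusion of per-class scores from two recognizers
    (an assumed rule; the abstract does not name the fusion methods)."""
    online = np.asarray(online_scores, dtype=float)
    offline = np.asarray(offline_scores, dtype=float)
    return weight * online + (1 - weight) * offline
```

In use, each recognizer would produce one score per candidate character, and the predicted class would be the index of the largest fused score, for example `int(np.argmax(fuse_scores(a, b)))`. The weight would normally be tuned on held-out data.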
[Degree-granting institution]: Xidian University
[Degree level]: Master's
[Year degree awarded]: 2015
[Classification number]: H214; TP391.43
Article ID: 1576898
Article link: http://sikaile.net/wenyilunwen/yuyanxuelw/1576898.html