Ancient Chinese Character Image Recognition Based on Hybrid-Kernel LS-SVM
Published: 2018-11-20 04:10
[Abstract]: Ancient Chinese characters record a wealth of political, economic, and historical material and have high value as historical sources. They are characterized by irregular strokes and numerous variant forms, and those that survive as stone inscriptions or silk manuscripts are often badly damaged; together these properties make ancient Chinese characters very difficult to recognize. Recognizing them with image processing techniques eases the circulation and preservation difficulties that arise when digitizing ancient books, and is of great significance for the inheritance and development of national culture. Because variant forms and local deformations are so common in ancient Chinese characters, existing image recognition methods struggle to produce accurate results. Support vector machines (SVMs) generalize well and resist noise when only small samples are available, and have been widely used in image recognition. This thesis combines a hybrid-kernel least squares support vector machine (LS-SVM) with image feature extraction and the curvelet transform to recognize images of ancient Chinese characters. The main work and conclusions are as follows:
1. To address the high misclassification rate caused by the strong similarity between ancient Chinese characters, the traditional SVM is improved and a hybrid-kernel weighted LS-SVM is used for classification. The hybrid-kernel weighted LS-SVM reduces the negative influence of abnormal samples and avoids the situation in which points are penalized more heavily the better or the worse they are classified, thereby improving classification accuracy.
2. A time-domain (as opposed to frequency-domain) multi-feature fusion method for feature extraction is studied. Component structure features and global generalized density features are extracted as global features, which are robust and have low algorithmic complexity. Grid stroke features and local point density features within a pseudo-two-dimensional elastic mesh are extracted as local features, which absorb local deformation well. The global and local features are fused and used as the input features of the classifier.
3. To address the low classification rate caused by the fact that most strokes of ancient Chinese characters are irregular curves, frequency-domain features are extracted with the second-generation curvelet transform, and a frequency-domain multi-feature fusion method for feature extraction is studied. The fast discrete second-generation curvelet transform is used to decompose each character image at multiple resolutions. A gray-level co-occurrence matrix is computed for the image at each resolution to obtain the texture feature parameters of each sub-image. The feature parameters of all sub-images are then fused into a high-dimensional feature vector, and its principal components are extracted as the input features of the classifier. Simulation results confirm the effectiveness of the method.
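As a rough illustration of the hybrid-kernel weighted LS-SVM described in item 1, the Python sketch below shows one way such a classifier could be set up. It is not taken from the thesis: the abstract does not state which kernels are mixed or how the sample weights are chosen, so the RBF-plus-polynomial mixture, the parameters `lam`, `sigma`, `degree`, `gamma`, `c1`, `c2`, and all function names are assumptions. The weighting step follows the general idea of robust weighted LS-SVM: fit once with equal weights, then down-weight samples whose error variables are outliers and refit.

```python
import numpy as np

def mixed_kernel(A, B, lam=0.7, sigma=1.0, degree=2):
    """Assumed hybrid kernel: lam * RBF (local) + (1 - lam) * polynomial (global)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    rbf = np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))
    poly = (A @ B.T + 1.0) ** degree
    return lam * rbf + (1.0 - lam) * poly

def solve_lssvm(K, y, gamma, v):
    """Solve the LS-SVM classifier linear system with per-sample weights v.

    [ 0   y^T                     ] [ b     ]   [ 0 ]
    [ y   Omega + diag(1/(g * v)) ] [ alpha ] = [ 1 ],  Omega_ij = y_i y_j K_ij
    """
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = y
    M[1:, 0] = y
    M[1:, 1:] = np.outer(y, y) * K + np.diag(1.0 / (gamma * v))
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def fit_weighted_lssvm(X, y, gamma=10.0, c1=2.5, c2=3.0, **kernel_args):
    """Two-pass fit: unweighted LS-SVM, then a refit with robust sample weights."""
    K = mixed_kernel(X, X, **kernel_args)
    _, alpha = solve_lssvm(K, y, gamma, np.ones(len(y)))
    e = alpha / gamma  # error variables of the unweighted solution
    # Assumption: robust z-score from centered errors scaled by 1.483 * MAD.
    dev = np.abs(e - np.median(e))
    r = dev / (1.483 * np.median(dev) + 1e-12)
    # Full weight for typical points, linear decay between c1 and c2, tiny weight beyond.
    v = np.where(r <= c1, 1.0, np.where(r <= c2, (c2 - r) / (c2 - c1), 1e-4))
    v = np.maximum(v, 1e-4)
    b, alpha = solve_lssvm(K, y, gamma, v)
    return b, alpha, v

def predict(X_train, y_train, b, alpha, X_new, **kernel_args):
    """Sign of the decision function sum_i alpha_i y_i K(x, x_i) + b."""
    return np.sign(mixed_kernel(X_new, X_train, **kernel_args) @ (alpha * y_train) + b)
```

Labels are expected as +1 and -1, and `X` would hold the fused feature vectors described in items 2 and 3. A multi-class character recognizer would need a one-vs-rest or one-vs-one scheme on top of this binary sketch.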
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP391.41
Article ID: 2343668
Article link: http://sikaile.net/jingjilunwen/zhengzhijingjixuelunwen/2343668.html