Design and Implementation of an xPlore-Based Image Search System
Published: 2018-05-23 08:03
Topic: XML + xPlore; Source: master's thesis, Nanjing University, 2013
[Abstract]: As enterprise document management systems continue to develop, their information management and search capabilities keep improving, yet support for the various kinds of image information remains immature. Ordinary document search focuses on basic file information and on building a full-text index over document content, but image files usually contain no text of their own, and neither an image's visual information nor its EXIF information is part of the basic file information. In xPlore, an enterprise document search system, support for image search is likewise immature: it can only search the basic file information of images. It is therefore necessary to study how to make xPlore support image search better.

Image search requires comprehensive image information, which falls into three main categories: text information, image information, and EXIF information. Text information can be obtained through optical character recognition (OCR); image information can be captured as an image fingerprint computed with a perceptual hashing algorithm; EXIF information includes camera information, location information, and so on. In xPlore, text extraction is performed in CPS, and this process has to be modified to handle image files, whereas extraction of the other image information can be added to the document processing pipeline through unstructured data annotators.

This thesis designs and implements an xPlore-based image search and management system. The first part consists of modifications to and configuration of xPlore, including improving the document content extraction process and implementing and configuring image-related unstructured data annotators. The second part is an image search system built on the modified xPlore; it uses GWT for the user interface and provides search from multiple angles: text, image, and EXIF information. In addition, a spell-checking step is applied after text recognition to make the results more accurate.

This project improves xPlore's support for searching all aspects of image information and implements an image system based on that search engine. Its development also further validated the support for unstructured annotators in the new version of xPlore. The resulting image search system provides image and album management, search, and publishing to social networks, making image management considerably more convenient for colleagues.
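The abstract names perceptual hashing as the source of the image fingerprint but does not say which variant the thesis uses. As a minimal sketch, assuming the simplest variant (average hash) and assuming fingerprints are compared by Hamming distance, the idea can be illustrated as follows. The function names and the plain list-of-lists grayscale input are illustrative choices, not part of xPlore or of the thesis.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging each block of pixels."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # at least one source row
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)  # at least one source column
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests similar images."""
    return bin(a ^ b).count("1")


# A horizontal gradient, a slightly brightened copy, and its negative.
gradient = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in gradient]
inverted = [[255 - v for v in row] for row in gradient]

print(hamming_distance(average_hash(gradient), average_hash(brighter)))
print(hamming_distance(average_hash(gradient), average_hash(inverted)))
```

In a search setting, a fingerprint like this would be stored with each image, and a query image would match those whose fingerprints lie within a small Hamming distance; the threshold is a tuning choice that the abstract does not specify.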
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP311.52
Article ID: 1923928
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1923928.html