Design and Implementation of a Lucene-Based Full-Text Retrieval System
Published: 2018-05-11 22:29
Topic: Lucene + full-text search; Source: Xiamen University, master's thesis, 2014
[Abstract]: Computer and Internet technology have developed enormously since the 1990s. As both became widespread, the amount of information people encounter has grown exponentially, forcing them to find ways to obtain the useful information they need quickly. Many information lookup techniques have been invented for this purpose, but how to store and search information quickly and efficiently remains a topic well worth studying at home and abroad.
Search engines are now one of the most mainstream technologies of the networked information age. Full-text retrieval, the core technology of a search engine, is retrieval that takes natural-language queries, is built on a full-text index, and treats text data as its main processing object. It differs from ordinary database retrieval: full-text retrieval has to handle both structured and unstructured data, whereas database retrieval can handle only structured data. Compared with ordinary database retrieval, full-text retrieval is therefore more powerful and meets user needs more easily.
This thesis discusses the full-text retrieval module of the office system of the College of Art. The basic requirement of the module is content search over text such as official documents, notices and announcements, and internal news. The system is developed on the J2EE architecture, uses the SSH2 development framework, and stores its data in a MySQL database.
The thesis first reviews the related technology, analyzing in detail the principles, components, data structures, and workflow of search engines. Then, following the actual requirements of the project and building on the Lucene library, it designs and implements a site search engine based on full-text retrieval that gives users a more convenient search function.
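The full-text index mentioned in the abstract is usually an inverted index, which maps each term to the documents that contain it. The following is a minimal sketch of that idea in plain Java, not the thesis's implementation and not Lucene's API: the class name `InvertedIndexSketch`, its methods, and the sample documents are all hypothetical. It also assumes that terms are separated by spaces or punctuation; Chinese text would need a word segmenter (an analyzer, in Lucene's terms) instead of this simple split.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration of an inverted index; names are not from the thesis or from Lucene.
public class InvertedIndexSketch {
    // term -> sorted set of IDs of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Assumption: lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Indexing: record the document ID under every term that occurs in the text.
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Searching: return the IDs of documents that contain every query term (AND semantics).
    public List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> docs = postings.get(term);
            if (docs == null) {
                return new ArrayList<>(); // one term is missing, so no document matches
            }
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs); // intersect with the postings of this term
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Notice: faculty meeting on Friday");
        index.add(2, "Internal news: art exhibition opens Friday");
        index.add(3, "Official document on exhibition funding");
        System.out.println(index.search("friday"));
        System.out.println(index.search("exhibition friday"));
    }
}
```

Because a lookup touches only the posting lists of the query terms, its cost depends on how many documents contain those terms rather than on the total size of the collection. A library such as Lucene layers analysis, relevance scoring, and on-disk index storage on top of the same basic structure.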
[Degree-granting institution]: Xiamen University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.3
Article ID: 1875905
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1875905.html