Improving the Efficiency of Lucene Full-Text Indexing
Published: 2019-05-18 09:03
[Abstract]: Lucene is a well-regarded open-source framework for full-text search. Its functionality can be extended in line with the framework's conventions, which makes it straightforward to embed in a search engine. This paper studies Lucene's index structure and indexing principles, and improves the efficiency of index creation in three ways: refining incremental indexing, enlarging the index buffer, and reducing how often index files are written to disk. A full-text retrieval experiment was designed to evaluate the approach. The results show that, for 10,000 documents, the average index-creation efficiency is 19.5% higher than that of previous methods, which suggests good prospects for practical use.
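[Affiliations]: Department of Computer Engineering, Langfang Yanjing Vocational Technical College; Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University; North China Institute of Aerospace Engineering; Beijing TRS Information Technology Co., Ltd.
[Funding]: Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201404); National Natural Science Foundation of China (61271304); Key Project of the Beijing Municipal Education Commission Science and Technology Development Plan and Class B Key Project of the Beijing Natural Science Foundation (KZ201311232037); 2013 Hebei Province Higher Education Science and Technology Research Self-Funded Project (Z2013162)
[CLC number]: TP391.3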
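The abstract attributes part of the speed-up to a larger index buffer and fewer writes to disk. The relationship between the two can be sketched with a toy model in plain Java. This is an illustration only, not Lucene code and not the authors' implementation: the class `BufferedIndexSketch`, its `maxBufferedDocs` parameter, and the buffer sizes 10 and 1,000 are all assumptions made for the example, and "flushing" here only increments a counter instead of writing a segment. The document count of 10,000 mirrors the experiment size mentioned in the abstract.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT Lucene): documents accumulate in an in-memory buffer and are
// "written" as one batch whenever the buffer reaches maxBufferedDocs.
class BufferedIndexSketch {
    private final int maxBufferedDocs;              // hypothetical tuning knob
    private final List<String> buffer = new ArrayList<>();
    private int flushCount = 0;                     // number of simulated disk writes

    BufferedIndexSketch(int maxBufferedDocs) {
        if (maxBufferedDocs < 1) {
            throw new IllegalArgumentException("maxBufferedDocs must be >= 1");
        }
        this.maxBufferedDocs = maxBufferedDocs;
    }

    // Buffer one document; flush automatically once the buffer is full.
    void addDocument(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBufferedDocs) {
            flush();
        }
    }

    // Stand-in for writing a segment to disk: only counts the write.
    void flush() {
        if (buffer.isEmpty()) {
            return;                                 // nothing pending, no write
        }
        flushCount++;
        buffer.clear();
    }

    int flushCount() {
        return flushCount;
    }

    // Index numDocs synthetic documents and report how many flushes occurred.
    static int countFlushes(int numDocs, int maxBufferedDocs) {
        BufferedIndexSketch writer = new BufferedIndexSketch(maxBufferedDocs);
        for (int i = 0; i < numDocs; i++) {
            writer.addDocument("doc-" + i);
        }
        writer.flush();                             // write any leftover documents
        return writer.flushCount();
    }

    public static void main(String[] args) {
        System.out.println(countFlushes(10_000, 10));     // prints 1000
        System.out.println(countFlushes(10_000, 1_000));  // prints 10
    }
}
```

The model counts writes, not time, so it says nothing about the 19.5% figure itself. In recent Lucene releases the corresponding settings are exposed on `IndexWriterConfig` as `setRAMBufferSizeMB` and `setMaxBufferedDocs`; the abstract does not state which Lucene version or which values the authors used.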
Article ID: 2479869
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2479869.html