Research on Long-Sentence Query Expansion Based on Concept Semantic Similarity
[Abstract]: With the rapid development of the Internet, web information retrieval has advanced quickly. Its main form today is the search engine, the second most widely used Internet service after email. Current search engines retrieve information mainly from the query keywords a user enters, but a handful of words cannot fully and accurately express the user's real intent. The ambiguity of the query itself leads the search engine to return many documents irrelevant to the user's needs, which lowers both recall and precision. On the other hand, users sometimes enter long-sentence queries, and the queries submitted to search engines are growing longer; retrieval results for these queries are often poor because of problems such as query topic drift. To address these problems, researchers have proposed query expansion, which modifies the original query terms to improve retrieval precision and recall, and it has achieved some success. Most of this work, however, targets short queries. In recent years, research abroad on long-sentence query expansion has been increasing, mainly because natural-language long sentences can better express complex and specific information needs and are likely to be a trend in how users express queries. Moreover, as query expansion moves toward semantic querying, the rich semantic associations contained in long-sentence queries provide a better basis for semantic query expansion and help in understanding users' complex language characteristics and differing grammatical habits.
Therefore, to address topic drift, low precision, and low recall in long-sentence queries, this thesis proposes a long-sentence query expansion method based on concept semantic similarity. The method first uses the AAlesk word sense disambiguation method to determine the correct sense of each query word, and then adds to the original query the related semantic concepts from the WordNet synonym set (synset) for that sense. The concepts in the query are then clustered by their pairwise semantic similarity to obtain a set of query clusters, and the best candidate concept set is selected according to the overall semantic similarity of each cluster and the semantic relevance importance of each concept. Finally, the highest-scoring keywords are extracted from that concept set to represent the original query, yielding an improved version of the original long-sentence query. In addition, a KeyGraph keyword extraction method is used to process the original long-sentence queries, and the improved queries produced by these two approaches are run through three different types of mainstream retrieval models in retrieval experiments. The experimental results show that the improved long-sentence queries retrieve more effectively, and in particular that the query expansion method proposed in this thesis better expresses users' real information needs at the semantic level. It greatly improves the precision and recall of long-sentence queries and is better suited to existing mainstream language retrieval models.
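The pipeline described above (disambiguate, expand with synonyms, cluster concepts by similarity, keep the most cohesive cluster, extract the top-scoring keywords) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the tiny `SENSES` table stands in for WordNet, a plain gloss-overlap Lesk stands in for AAlesk, Jaccard overlap of gloss words stands in for the concept similarity measure, and the greedy clustering, `cohesion`, and `importance` scores are simple placeholders for the thesis's own formulas.

```python
from itertools import combinations

# Hypothetical toy sense inventory standing in for WordNet:
# word -> {sense id: (gloss, synonyms)}
SENSES = {
    "bank": {
        "bank.finance": ("institution that lends money as a loan and pays interest", ["depository"]),
        "bank.river": ("sloping land beside a river", ["riverside"]),
    },
    "loan": {"loan.finance": ("money that a bank lends and is repaid with interest", ["credit"])},
    "interest": {
        "interest.finance": ("money paid to a bank for a loan", ["yield"]),
        "interest.hobby": ("activity that attracts curiosity", ["pastime"]),
    },
    "weather": {"weather.climate": ("state of the atmosphere such as rain or sun", ["climate"])},
}
STOP = {"a", "the", "that", "and", "or", "of", "for", "on", "is", "as", "such",
        "with", "to", "what", "does", "when"}


def tokens(text):
    """Lowercased content words of a text, as a set."""
    return {w for w in text.lower().split() if w not in STOP}


def gloss(sense):
    """Content words of the gloss for a sense id such as 'bank.finance'."""
    word = sense.split(".")[0]
    return tokens(SENSES[word][sense][0])


def lesk(word, query):
    """Simplified Lesk: pick the sense whose gloss overlaps the query context most."""
    context = tokens(query) - {word}
    return max(SENSES[word], key=lambda s: len(gloss(s) & context))


def similarity(s1, s2):
    """Jaccard overlap of gloss words (assumed stand-in for a WordNet measure)."""
    g1, g2 = gloss(s1), gloss(s2)
    return len(g1 & g2) / len(g1 | g2)


def cluster(senses, threshold=0.2):
    """Greedy single-link grouping: join the first cluster holding a similar concept."""
    clusters = []
    for s in senses:
        home = next((c for c in clusters
                     if any(similarity(s, m) >= threshold for m in c)), None)
        if home is None:
            clusters.append([s])
        else:
            home.append(s)
    return clusters


def cohesion(members):
    """Mean pairwise similarity within a cluster (0 for a singleton)."""
    pairs = list(combinations(members, 2))
    return sum(similarity(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def reformulate(query, top_k=2):
    """Return keywords and synonyms from the most cohesive concept cluster."""
    words = [w for w in query.lower().split() if w in SENSES]
    senses = [lesk(w, query) for w in words]
    best = max(cluster(senses), key=cohesion)

    def importance(s):
        # Assumed score: mean similarity of a concept to the rest of its cluster.
        others = [m for m in best if m != s]
        return sum(similarity(s, m) for m in others) / len(others) if others else 0.0

    result = []
    for s in sorted(best, key=importance, reverse=True)[:top_k]:
        word = s.split(".")[0]
        result += [word] + SENSES[word][s][1]
    return result


query = "what interest does the bank charge on a loan when the weather is bad"
print(reformulate(query))
```

With this toy data the off-topic term "weather" lands in its own singleton cluster and is dropped, while the financial senses of "interest", "bank", and "loan" cluster together; the call prints `['loan', 'credit', 'bank', 'depository']`. A real implementation would swap in WordNet glosses and a proper concept similarity measure.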
[Degree-granting institution]: Shandong University of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
Yang Zhenyu; Research on Long-Sentence Query Expansion Based on Concept Semantic Similarity [D]; Shandong University of Technology; 2013