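Research on Query Expansion Based on Dependency Networks
Published: 2018-06-05 19:10
Topic: real-time query expansion + dependency relations; Source: Beijing University of Posts and Telecommunications, 2013 master's thesis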
[Abstract]: With the rapid growth of information on the Internet, search engines have become an indispensable tool for finding information online quickly. A user only needs to enter query terms to get search results. However, a query usually contains only a few words and is often ambiguous, so it does not always reflect the user's intent accurately, and irrelevant results are returned.
Real-time query expansion is a technique that expands the user's input so that it reflects the query intent more accurately. By recommending new query terms to the user in real time, it can complete the user's query, reduce the amount of typing, and resolve ambiguity in intent. Traditional real-time query expansion techniques mostly rely on query logs, performing query completion and query recommendation based on keyword frequency.
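The traditional log-based approach mentioned above can be illustrated with a minimal sketch. It assumes a plain list of past queries and ranks completions by raw frequency; the function name `complete_from_log` and the sample log are hypothetical and do not come from the thesis.

```python
from collections import Counter

def complete_from_log(prefix, query_log, k=3):
    """Return the k most frequent logged queries that start with prefix."""
    counts = Counter(q for q in query_log if q.startswith(prefix))
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:k]]

# Hypothetical query log, invented for illustration.
log = ["buy phone", "buy phone case", "buy phone", "buy laptop", "book hotel"]
suggestions = complete_from_log("buy", log)
print(suggestions)
```

A scheme like this depends only on what earlier users typed, which is why it needs a query log and cannot make use of grammatical relations between words.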
This thesis first proposes a method for representing query intent based on "verb + modifier + noun" dependency relations. Based on an analysis of a large corpus of 915,600 documents totaling 1.15 GB, it constructs a dependency network with more than 50,000 nodes.
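One way such a network could be assembled is sketched below. The abstract does not describe the parser, the edge types, or the weighting, so everything here is an assumption: triples are taken as already extracted by some dependency parser, words are nodes, and edge weights are simple co-occurrence counts. The name `build_dependency_network` is hypothetical.

```python
from collections import defaultdict

def build_dependency_network(triples):
    """Build an undirected weighted graph from (verb, modifier, noun) triples.

    network[a][b] counts how often words a and b were linked by a dependency.
    """
    network = defaultdict(lambda: defaultdict(int))
    for verb, modifier, noun in triples:
        # Assumed edges: verb-noun and modifier-noun.
        pairs = [(verb, noun)]
        if modifier:  # the modifier slot may be empty
            pairs.append((modifier, noun))
        for a, b in pairs:
            network[a][b] += 1
            network[b][a] += 1
    return network

# Hypothetical triples, invented for illustration.
triples = [
    ("buy", "cheap", "phone"),
    ("buy", "", "ticket"),
    ("repair", "cheap", "phone"),
    ("buy", "used", "phone"),
]
network = build_dependency_network(triples)
print(sorted(network))
```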
It then proposes a method that uses this large-scale dependency network to perform real-time query expansion for users. Experiments show that the method achieves an expansion success rate of 84% and reduces the amount of input a user has to type.
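A hedged sketch of how expansion over a word network might work follows. The scoring rule (sum of edge weights to every word typed so far) is an assumption for illustration, not the method evaluated in the thesis, and the small `network` dictionary is invented.

```python
def expand_query(typed_words, network, k=3):
    """Suggest up to k words most strongly connected to the words typed so far."""
    scores = {}
    for word in typed_words:
        for neighbor, weight in network.get(word, {}).items():
            if neighbor not in typed_words:
                scores[neighbor] = scores.get(neighbor, 0) + weight
    # Highest total weight first; alphabetical order breaks ties.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Hypothetical network: word -> {neighbor: edge weight}.
network = {
    "buy": {"phone": 2, "ticket": 1},
    "cheap": {"phone": 2, "ticket": 1},
    "phone": {"buy": 2, "cheap": 2, "repair": 1},
    "ticket": {"buy": 1, "cheap": 1},
    "repair": {"phone": 1},
}
print(expand_query(["buy", "cheap"], network))
print(expand_query(["cheap", "buy"], network))
```

Because this sketch scores candidates against the set of typed words, the two calls return the same suggestions regardless of the order in which the words were entered.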
Finally, a real-time query expansion system with full retrieval functionality is implemented. The system combines the query term expansion technique above with string-based word completion to perform real-time query expansion. System evaluation shows that it reduces user operations by 63.75%, and that after expansion the nDCG score of the retrieval results reaches 88.95%. A comparison with Microsoft's Bing search engine shows that the system expands queries more stably when the user enters the words in a different order.
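For readers unfamiliar with the nDCG metric cited above, a minimal sketch of one common formulation follows. The abstract does not say which variant or cutoff the thesis used, so the linear gain and the `log2(rank + 1)` discount here are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevance labels for a ranked result list.
print(ndcg([3, 2, 1]))  # already in ideal order
print(ndcg([1, 2, 3]))  # worst order for these labels
```

A ranking that places the most relevant results first scores 1.0, and any other ordering of the same labels scores lower.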
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3