

Research on Query Expansion Based on Dependency Relation Networks

Published: 2018-06-05 19:10

  Topic of this paper: real-time query expansion + dependency relations. Source: Beijing University of Posts and Telecommunications, master's thesis, 2013


[Abstract]: With the rapid growth of information on the Internet, search engines have become an indispensable tool for quickly obtaining online information. A user only needs to enter query words to get the corresponding search results. However, a query usually contains only a few words and is often ambiguous, so it sometimes fails to reflect the user's intent accurately, and irrelevant results are returned.
Real-time query expansion is a technique that augments the user's input so that it reflects the query intent more accurately. By recommending new query words to the user in real time, it can complete the user's query, reduce the amount of typing, and resolve ambiguity of intent. Traditional real-time query expansion techniques mostly rely on query logs and perform query completion and query recommendation based on keyword frequency.
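The traditional log-based approach described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular search engine: the function names `build_frequency_index` and `complete`, and the contents of `QUERY_LOG`, are invented for this example.

```python
from collections import Counter

def build_frequency_index(query_log):
    """Count how often each whole query appears in the log."""
    return Counter(q.strip().lower() for q in query_log if q.strip())

def complete(prefix, freq_index, k=3):
    """Return up to k logged queries that start with prefix, most frequent first."""
    prefix = prefix.strip().lower()
    matches = [(q, n) for q, n in freq_index.items() if q.startswith(prefix)]
    # Sort by descending count, then alphabetically so ties are deterministic.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:k]]

# Hypothetical query log, invented for illustration only.
QUERY_LOG = [
    "download music", "download music", "download movie",
    "download music player", "buy train ticket",
]

index = build_frequency_index(QUERY_LOG)
suggestions = complete("down", index, k=2)
```

Because the ranking depends only on how often a query string was logged, a suggestion of this kind cannot surface an intent that has not already been typed in that exact form.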
This thesis first proposes a method for representing query intent based on the "verb + modifier + noun" dependency relation. Based on an analysis of a large corpus of 915,600 documents with a total size of 1.15G, a dependency network with more than 50,000 nodes is then constructed.
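One plausible way to turn such triples into a network is sketched below. The abstract does not describe the thesis's actual graph structure, so the edge scheme here (noun linked to its verb and to its modifier, weighted by co-occurrence count) is an assumption, as are the function name `build_network` and the sample triples. The dependency parsing step that would produce the triples from raw text is not shown.

```python
from collections import defaultdict

def build_network(triples):
    """Build an undirected weighted graph: word -> {neighbor: co-occurrence count}."""
    network = defaultdict(lambda: defaultdict(int))
    for verb, modifier, noun in triples:
        # Assumed edge scheme: link the noun to its verb and to its modifier.
        for a, b in ((verb, noun), (modifier, noun)):
            if a and b:  # the modifier (or verb) may be missing in a triple
                network[a][b] += 1
                network[b][a] += 1
    return network

# Hypothetical (verb, modifier, noun) triples, invented for illustration only.
TRIPLES = [
    ("download", "free", "music"),
    ("download", None, "music"),
    ("play", "classical", "music"),
]

network = build_network(TRIPLES)
node_count = len(network)
```

Storing counts on the edges keeps the corpus statistics available for ranking candidates later, at the cost of one dictionary entry per distinct word pair.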
Next, a method is proposed that uses this large-scale dependency network to perform real-time query expansion for users. Experiments show that the method achieves an expansion success rate of 84% and reduces the amount of input the user needs to type.
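As a rough illustration of network-based expansion, the sketch below ranks candidate words by the total weight of their edges to the words typed so far. It assumes the dependency network is already available as a mapping from each word to its weighted neighbors; the scoring rule, the function name `expand`, and the toy `NETWORK` are assumptions made for this example and are not taken from the thesis.

```python
def expand(query_words, network, k=3):
    """Suggest up to k words connected to the words typed so far."""
    typed = set(query_words)
    scores = {}
    for word in typed:
        for neighbor, weight in network.get(word, {}).items():
            if neighbor not in typed:
                # Assumed scoring: sum the edge weights from all typed words.
                scores[neighbor] = scores.get(neighbor, 0) + weight
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Toy network with hypothetical weights, invented for illustration only.
NETWORK = {
    "download": {"music": 2, "movie": 1},
    "music": {"download": 2, "free": 1, "play": 1, "classical": 1},
    "free": {"music": 1, "movie": 1},
    "movie": {"download": 1, "free": 1},
    "play": {"music": 1},
    "classical": {"music": 1},
}

suggestions = expand(["download", "free"], NETWORK)
```

In this sketch the typed words are treated as a set, so `expand(["free", "download"], NETWORK)` returns the same suggestions as the call above; a prefix-matching completer would treat the two orders as different strings.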
Finally, a real-time query expansion system with complete retrieval functionality is implemented. The system combines the query word expansion technique above with string-based word completion to perform real-time query expansion. System evaluation shows that the system reduces user operations by 63.75%. After expansion, the nDCG score of the retrieval results reaches 88.95%. A comparison with Microsoft's Bing search engine shows that the system expands queries more stably when the word order of the user's input varies.
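For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them. The abstract does not state which nDCG variant or which definition of "user operations" the thesis uses, so both formulas are assumptions: nDCG here uses linear gain with a log2 rank discount, and the reduction is taken as the relative drop in operation count. The input numbers are invented and are not the thesis's data.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def operation_reduction(ops_without, ops_with):
    """Fraction of user operations saved (assumed definition)."""
    return 1.0 - ops_with / ops_without

# Hypothetical graded relevance judgments for one ranked result list.
score = ndcg([3, 1, 2])
# Hypothetical counts: 16 operations without expansion, 6 with it.
saved = operation_reduction(16, 6)
```

An nDCG of 1.0 means the results are already in the ideal order for the given judgments, and lower values indicate that relevant results appear further down the list.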
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3

