Research on Query Expansion Based on Ontology and User Logs

Published: 2018-05-03 19:12

  Topic: ontology + query expansion; Source: Hunan University, master's thesis, 2013


[Abstract]: With the explosive growth of information on the Internet, it has become increasingly difficult for users to find the information they actually want among vast amounts of data. Search engines solve this problem to some extent. However, users often cannot express their query intent precisely: poorly chosen or overly short query terms lead to low recall and precision, and the search engine fails to return useful information. Expanding user queries has therefore become a pressing need.

Query expansion has been studied for decades, and researchers in China and abroad have proposed many methods. These common methods, however, often fail to understand user input at the semantic level, and because the source of their expansion terms is uncertain, they easily add terms unrelated to the query, which causes the "query drift" problem. This thesis combines a domain ontology with user query logs and proposes a query expansion algorithm based on ontology and user logs. The domain ontology is used to expand the user query at the semantic level, forming an initial set of expansion concepts; word co-occurrence analysis over the user query logs then applies a second round of filtering to that set. The main contents are as follows:

(1) The research background and significance of the topic are described, the progress and shortcomings of current query expansion techniques are analyzed, and the relevant background knowledge and theory are introduced, laying the theoretical foundation for the work that follows.

(2) An ontology-based formula for the semantic similarity of concepts is proposed. It is used to compute the semantic similarity of candidate expansion terms, so that user queries are expanded at the semantic level.

(3) A word co-occurrence formula based on user logs is proposed. Co-occurrence is computed for each initial expansion term, and the result serves as that term's co-occurrence weight. The semantic similarity weight and the co-occurrence weight of each expansion term are then combined for a second round of filtering, which avoids the "query drift" that the initial expansion is prone to.

(4) Based on the proposed algorithm and the query requirements of an after-sales service tracking system for domestically produced software and hardware, a prototype system is designed and implemented. The overall framework of the system and each of its modules are described. Finally, comparative experiments are carried out on the system. The results show that, compared with traditional query expansion methods, the proposed method effectively improves retrieval accuracy while maintaining good robustness.
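The two-stage scheme described in the abstract (ontology-based expansion followed by log-based filtering) can be sketched roughly as below. The abstract does not give the thesis's actual formulas, so everything concrete here is an assumption made for illustration: the toy ontology and query log, the path-based similarity `1 / (1 + distance)`, the Jaccard-style co-occurrence measure, the linear mix controlled by `alpha`, and all function names. It is a minimal sketch of the general idea, not the author's algorithm.

```python
from collections import deque

# Hypothetical toy ontology: undirected links between domain concepts.
ONTOLOGY = {
    "printer": {"laser printer", "toner", "device"},
    "laser printer": {"printer"},
    "toner": {"printer"},
    "device": {"printer", "router"},
    "router": {"device"},
}

# Hypothetical query log: each entry is the set of terms in one past query.
QUERY_LOG = [
    {"printer", "toner"},
    {"printer", "toner", "replace"},
    {"printer", "laser printer"},
    {"router", "reset"},
]


def path_length(graph, start, goal):
    """Return the shortest number of edges between two concepts, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def semantic_similarity(graph, a, b):
    """Assumed path-based similarity: 1 / (1 + shortest path length)."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def cooccurrence(log, a, b):
    """Assumed Jaccard-style co-occurrence of two terms over logged queries."""
    with_a = sum(1 for q in log if a in q)
    with_b = sum(1 for q in log if b in q)
    both = sum(1 for q in log if a in q and b in q)
    union = with_a + with_b - both
    return both / union if union else 0.0


def expand_query(term, graph, log, sim_threshold=0.3, alpha=0.5, top_k=2):
    """Stage 1: ontology expansion. Stage 2: filter by a combined weight."""
    # Stage 1: keep ontology concepts that are similar enough to the query.
    initial = {}
    for concept in graph:
        if concept == term:
            continue
        sim = semantic_similarity(graph, term, concept)
        if sim >= sim_threshold:
            initial[concept] = sim
    # Stage 2: mix similarity with log co-occurrence and keep the best terms.
    scored = {
        c: alpha * sim + (1 - alpha) * cooccurrence(log, term, c)
        for c, sim in initial.items()
    }
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    for concept, weight in expand_query("printer", ONTOLOGY, QUERY_LOG):
        print(f"{concept}: {weight:.3f}")
```

In this toy data, "device" and "router" survive the ontology stage but never appear alongside "printer" in the log, so the second stage ranks them below "toner" and "laser printer". That is the kind of drift-prone candidate the log-based filter is meant to push out.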
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.1

Article ID: 1839733


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1839733.html

