
Research on Entity Information Query Expansion Algorithms in Object Retrieval

Published: 2019-07-10 08:34
[Abstract]: This thesis studies entity information expansion algorithms for object retrieval. Information needs have gradually shifted from loosely targeted web page retrieval to object retrieval, which has made entity information extraction one of the core technologies; entity information expansion is an important part of it. The goal of entity information extraction is to automatically build an entity knowledge base containing each entity's attribute information. The entity information query expansion studied here has two goals. The first is to enrich entity query terms: when a query carries incomplete information, expanding it removes ambiguity and clarifies the query intent. The second is to expand coreferent information such as entity aliases, so that information attached to different mentions of the same entity can be merged and shared.

The main work of this thesis is as follows.

First, object retrieval is compared with traditional information retrieval, with emphasis on how entity information expansion and traditional query expansion differ and relate in preprocessing, term selection, relevance computation, and matching methods. On this basis the two main research topics are fixed: entity information expansion based on statistical learning, and entity information expansion based on grammatical rules.

Second, to expand an entity with highly relevant terms, the thesis proposes a probabilistic, statistics-based entity information expansion method. It uses relevance feedback together with a hierarchical clustering algorithm to mine the co-occurrence relevance between entities and terms within the set of relevant documents, and thereby expands the entity's descriptive information. With this model, related terms were expanded for more than two thousand entities and applied to the TREC 2012 Microblog evaluation task; the results confirm the model's effectiveness.

Finally, for information such as entity aliases, synonyms, and identity descriptions, the thesis presents a grammar-rule-based entity information expansion method. After lexical-analysis preprocessing, it uses the grammatical features of coreferent expressions to resolve coreference among entity mentions, and thereby expands aliases and similar information. The model achieved good results in two subtasks of TAC 2012 KBP, which confirms its effectiveness.

Figure caption: Agglomerative hierarchical clustering partitioning strategy. All documents in the resulting cluster serve as supporting information for the entity; in later steps, entity-specific information is extracted from these documents to expand the entity's information.
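The co-occurrence mining step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's scoring formula, so the Dice coefficient below is a stand-in, the hierarchical clustering step is omitted, and the function name `expand_entity` and its parameters are hypothetical.

```python
from collections import Counter


def expand_entity(entity, docs, top_k=3, stopwords=frozenset()):
    """Rank candidate expansion terms by document-level co-occurrence with `entity`.

    Assumptions (not from the thesis): whitespace tokenization, and the Dice
    coefficient 2*|D(e,t)| / (|D(e)| + |D(t)|) as the relevance score.
    """
    entity = entity.lower()
    doc_freq = Counter()  # number of documents containing each term
    co_freq = Counter()   # number of documents containing both term and entity
    entity_docs = 0       # number of documents containing the entity

    for doc in docs:
        terms = set(doc.lower().split())
        doc_freq.update(terms)
        if entity in terms:
            entity_docs += 1
            co_freq.update(terms - {entity})

    scores = {
        term: 2 * count / (entity_docs + doc_freq[term])
        for term, count in co_freq.items()
        if term not in stopwords
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


feedback_docs = [
    "jaguar car engine",
    "jaguar car dealer",
    "jaguar cat jungle",
    "apple fruit",
]
print(expand_entity("jaguar", feedback_docs, top_k=1))
```

In this toy feedback set, "car" co-occurs with the entity in two of its three documents and appears nowhere else, so it ranks first. A fuller version would first cluster the feedback documents and score terms only within the cluster that supports the entity.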
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092; TP391.3

