Research on Related Entity Discovery on the Web

Published: 2018-05-02 11:24

  Topics: related entity discovery + type refinement; Source: Beijing Jiaotong University, doctoral dissertation, 2013


[Abstract]: With the rapid development of the Internet and information retrieval technology, the Web has become an important channel through which people obtain information, and search engines have become an important tool for retrieving it. In traditional search, a user submits a query to a search engine (such as Google or Baidu), and the search engine returns a list of relevant documents based on that query. Often, however, what the user needs is not the documents themselves but the entity information they contain. Finding the entities a user needs among a large number of Web documents has therefore become a research focus in recent years, and related entity discovery addresses exactly this kind of entity-oriented query. Related entity discovery is defined as follows: given a query consisting of a source entity, a target type, and a description of the relation between the source entity and the target entities, find a set of entities that satisfy the query.
The returned entities must match the type required by the query, but the given target type is often very coarse, which makes it impossible to judge the type of a retrieved entity accurately. To address this problem, we did the following work:
1) We propose a method for automatically obtaining a fine-grained target type and its hyponym seed entities. The fine-grained target type is obtained by syntactic analysis of the query statement, and query templates are used to obtain hyponym seed entities of the target type.
2) We propose an induction-based method for obtaining the set of rules that identify hyponym categories of the fine-grained target type. When the number of seed entities is small, this rule set is obtained by induction.
3) We propose a feature-extraction-based method for obtaining the same kind of rule set. When the number of seed entities is large, the rule set is obtained with the best feature extraction method found through learning.
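The query-template step in item 1) can be pictured as a "such as" pattern applied to retrieved text snippets. The sketch below is an illustration only. The abstract does not give the actual templates, so the pattern, the function name `extract_seed_entities`, and the sample snippet are all assumptions.

```python
import re


def extract_seed_entities(target_type, snippets):
    """Collect candidate hyponym seeds for target_type from text snippets.

    Assumption: a single '<type> such as A, B and C' template. The templates
    actually used in the thesis are not given in the abstract.
    """
    pattern = re.compile(
        re.escape(target_type) + r"\s+such as\s+([^.;]+)", re.IGNORECASE
    )
    seeds = []
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            # Split the enumerated list on commas and on 'and' / 'or'.
            parts = re.split(r",|\band\b|\bor\b", match.group(1))
            for part in parts:
                name = part.strip()
                if name and name not in seeds:
                    seeds.append(name)
    return seeds


# Hypothetical snippet, written only for this example.
sample = ["The alliance includes airlines such as Lufthansa, United Airlines and Air Canada."]
print(extract_seed_entities("airlines", sample))
```

In a pipeline like the one described, seeds gathered this way would then feed the rule-acquisition steps in items 2) and 3). How the thesis filters noisy matches is not stated in the abstract.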
Because the initially retrieved candidate entities are unordered, all candidates must be ranked in order to obtain the entities that satisfy the user's query. To address this problem, we did the following work:
1) We propose an entity ranking method based on a generative probabilistic model. Entities are ranked by combining three components: entity relevance, entity type relevance, and entity relation relevance. The best ranking method is identified by comparing several ways of combining them. Entity type relevance is computed in two ways: one uses the rule set obtained by induction, computing type relevance with different numbers of rule sets, and the other uses the rule set obtained by feature extraction. For entity relation relevance, we evaluate the effect of two smoothing methods on entity ranking and propose a computation method that reconstructs the relation after removing stop words, which improves ranking quality and reduces running time.
2) We propose an entity ranking method based on a Markov random field. The method represents an entity by three attributes (document, type, and name) and ranks entities by linearly combining, with learned optimal weight parameters, the relevance between the query and the document representing the candidate entity, the relevance between the target type and the candidate entity's type, and the relevance between the source entity and the candidate entity's name.
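The two ranking ideas above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's model: the abstract does not name its two smoothing methods, so Jelinek-Mercer smoothing is used here as a stand-in. The stop-word list, the function names, and all numbers are hypothetical.

```python
import math
from collections import Counter

# Assumed stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "a", "an", "that", "are", "is", "by", "in", "to"}


def relation_log_likelihood(relation, context_tokens, collection_tokens, lam=0.5):
    """log P(relation | entity context) with Jelinek-Mercer smoothing (assumed).

    Stop words are removed from the relation before scoring. Terms that never
    occur in the collection are skipped to avoid log(0).
    """
    ctx, coll = Counter(context_tokens), Counter(collection_tokens)
    ctx_len, coll_len = len(context_tokens), len(collection_tokens)
    score = 0.0
    for term in relation.lower().split():
        if term in STOP_WORDS or coll[term] == 0:
            continue
        p_ctx = ctx[term] / ctx_len if ctx_len else 0.0
        p_coll = coll[term] / coll_len
        score += math.log((1 - lam) * p_ctx + lam * p_coll)
    return score


def generative_score(p_entity, p_type, relation_ll):
    """Product of the three components, computed in log space."""
    return math.log(p_entity) + math.log(p_type) + relation_ll


def linear_score(features, weights):
    """Weighted linear combination of the three relevance features."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical toy data: two candidate contexts and a tiny collection.
context_a = ["member", "star", "alliance"]
context_b = ["airline"]
collection = context_a + context_b + ["member"]
relation = "member of the star alliance"

ll_a = relation_log_likelihood(relation, context_a, collection)
ll_b = relation_log_likelihood(relation, context_b, collection)
print(generative_score(0.6, 0.8, ll_a) > generative_score(0.6, 0.8, ll_b))
```

Dropping stop words shortens the relation, so fewer terms are scored per candidate. That is one plausible reading of the reduced running time the abstract reports. `linear_score` only shows the shape of the final combination in item 2). A real Markov random field model would derive the three features from potential functions and learn the weights from training data.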
In the related entity discovery task, an entity is defined as being represented by its unique homepage, so after all candidate entities are ranked, the homepage of each entity must also be found. For this homepage finding problem, we propose a method that combines a score from a multi-attribute representation of the Web page with a score from the external links of the entity's Wikipedia page.
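As a rough sketch of this score merging, the example below combines a weighted per-attribute match score with a binary signal for whether the page appears among the external links of the entity's Wikipedia page. The attribute names, the weights, the binary link score, and the interpolation parameter `link_weight` are all assumptions. The abstract does not specify how the two scores are computed or merged.

```python
def find_homepage(candidates, wikipedia_external_links, attr_weights, link_weight=0.5):
    """Return the candidate URL with the highest merged score.

    candidates: dict mapping URL -> dict of per-attribute match scores in [0, 1].
    wikipedia_external_links: set of URLs listed as external links on the
        entity's Wikipedia page.
    """
    def merged(url):
        attrs = candidates[url]
        # Multi-attribute score: weighted sum over the page's attributes.
        attr_score = sum(
            attr_weights[name] * attrs.get(name, 0.0) for name in attr_weights
        )
        # Link score: 1 if Wikipedia links to this page, else 0 (assumed form).
        link_score = 1.0 if url in wikipedia_external_links else 0.0
        return (1 - link_weight) * attr_score + link_weight * link_score

    return max(candidates, key=merged)


# Hypothetical candidates with made-up attribute scores.
pages = {
    "http://a.example/": {"title": 0.9, "url": 0.5},
    "http://b.example/": {"title": 0.6, "url": 0.6},
}
weights = {"title": 0.5, "url": 0.5}
print(find_homepage(pages, {"http://b.example/"}, weights))
```

With these toy numbers the Wikipedia link signal overrides a slightly better attribute match, which is the kind of trade-off `link_weight` would control.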
The experimental results show that the proposed methods complete the related entity discovery task effectively, greatly reduce the manual effort users spend collecting related entity information, and provide users with effective results.

[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Doctoral
[Year degree conferred]: 2013
[Classification number]: TP391.3


Article ID: 1833667


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1833667.html

