
Research on Scenic-Spot Entity Resolution Based on Multi-Stage Mixed Attributes

Published: 2018-04-29 14:24

  Topics: scenic-spot entity resolution + multi-stage; Source: Jiangxi Normal University, master's thesis, 2015


[Abstract]: Entity resolution is a long-established research direction that has recently regained attention, and domain-specific entity resolution is one of its active topics. Unlike general-purpose entity resolution, domain-specific entity resolution must analyze and capture the characteristics of the domain data comprehensively and exploit them fully. General-purpose methods usually match all feature data at once in a single stage. This lets different features interfere with one another and prevents each feature from being used in a targeted way, and both effects reduce the accuracy of entity resolution. Working in the tourism information domain, and building on a review of the literature on domain-independent and domain-specific entity resolution, this thesis therefore proposes a scenic-spot entity resolution method based on multi-stage mixed attributes. The method extracts feature information from the different attributes of scenic spots across different tourism data sources, and uses that information repeatedly through algorithms designed for each stage. The attributes include the scenic spot's name, its location, and its description.

Resolution proceeds in two stages. The first stage clusters scenic spots from different travel websites using the nouns in their descriptions. The second stage works on the clustering results and applies a bucket algorithm that uses the similarity of scenic-spot names, together with the similarity of the person names and place names found in the descriptions, to perform entity resolution.

The contributions of the thesis are as follows:
(1) It addresses the problem of entity resolution for tourist scenic spots.
(2) It proposes a complete scenic-spot entity resolution framework based on multi-stage mixed attributes, which uses the effective information in entity attributes in a targeted way at each stage.
(3) It proposes a scenic-spot similarity measure that combines the scenic-spot name with the scenic-spot description.
(4) It proposes an optimized k-means clustering algorithm based on farthest initial center points and a silhouette-coefficient evaluation function.
(5) It adapts a bucket-based resolution algorithm.
(6) It reports extensive comparative experiments on a real scenic-spot data set.
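The abstract names a k-means variant that uses farthest initial center points and a silhouette-coefficient evaluation function, but gives no algorithmic detail. The following is a minimal sketch of one common reading of that idea, not the thesis's actual algorithm. It assumes that each scenic spot has already been turned into a numeric feature vector (for example, noun counts from its description), that Euclidean distance is used, and that the first center is the point farthest from the data centroid. All function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def farthest_first_centers(points, k):
    """Pick k initial centers that are far apart from one another.

    Assumption: the first center is the point farthest from the centroid;
    each later center maximizes its distance to the nearest chosen center.
    """
    centroid = [sum(col) / len(points) for col in zip(*points)]
    centers = [max(points, key=lambda p: dist(p, centroid))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Standard Lloyd iterations, seeded with farthest-first centers."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; needs at least two non-empty clusters."""
    clusters = set(labels)
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance inside the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_kmeans(points, k_values):
    """Run k-means for each candidate k and keep the best silhouette score."""
    best = None
    for k in k_values:
        labels = kmeans(points, k)
        if len(set(labels)) < 2:
            continue  # silhouette is undefined for a single cluster
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best
```

With two well-separated groups of toy vectors, `best_kmeans(points, range(2, 5))` would be expected to select `k = 2`. In the setting the abstract describes, the rows of `points` would instead be description-derived vectors, one per scenic spot.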
[Degree-granting institution]: Jiangxi Normal University
[Degree level]: Master's
[Year degree awarded]: 2015
[Classification number]: F590.3





Article link: http://sikaile.net/jingjilunwen/lyjj/1820310.html

