
Research and Implementation of Automatic Detection and Summarization Algorithms for Microblog News Events Based on Heterogeneous Networks

Published: 2018-06-29 23:13

  Topics: heterogeneous information network + cross-modal fusion; Source: Southwest Jiaotong University, master's thesis, 2017


[Abstract]: Microblog platforms now play an important role in spreading information in real time. However, because microblog data are large in scale, highly time-sensitive, and unstructured, common data mining methods are no longer suitable for processing them. Traditional microblog event detection and summarization methods also ignore the rich visual and social information available on these platforms. To overcome this shortcoming and help people quickly grasp the essential meaning of large volumes of posts, this thesis takes about 1 million data items covering multiple hot topics on the well-known social network Twitter as its main research object and studies cross-modal microblog event detection and summarization. Considering text, visual, social, temporal, and other features, it proposes an event detection and summarization framework based on heterogeneous networks. First, in the data preprocessing stage, strict filtering patterns are defined to remove meaningless posts and images. Next, in the event detection stage, a heterogeneous network is used to model the heterogeneous nature of microblog data, a late multimodal fusion entity similarity model is used to combine the heterogeneous features of the Twitter data, and an approximate similarity algorithm is used to generate a homogeneous graph from the fused features. An improved DBSCAN algorithm that incorporates a probabilistic model is then applied to the homogeneous similarity graph to address the sub-topic segmentation problem, and the resulting clusters are ranked by the popularity and novelty of their sub-topics. Finally, textual and visual summaries are generated for each topic. The contributions of this thesis are as follows: (1) Multimodal information is used to build a dynamic heterogeneous information network, which addresses the inability of traditional methods to exploit the rich additional information in microblogs. An AFF function fuses the multimodal features, taking their semantic similarity and spatio-temporal proximity into account to distinguish events. The heterogeneous network is then converted into a homogeneous network, which preserves the key information while simplifying the structure for the subsequent detection and summarization steps. (2) To improve the diversity of detection and summarization and to reduce topic segmentation, the HRDBSCAN algorithm is proposed for the clustering stage; it builds on the original clustering algorithm and uses probabilistic and statistical methods to merge similar clusters. In the summarization stage, the sub-topic summary results are clustered again so that each sub-topic appears only once in the summary. (3) Experiments on a Twitter dataset containing a number of real events show the novelty and superiority of the proposed framework compared with existing methods.
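
The detection stage described in the abstract (fuse per-modality similarities, build a homogeneous similarity graph, then run density-based clustering on it) can be illustrated with a small sketch. This is not the thesis's AFF function or HRDBSCAN, whose definitions are not given on this page: the weighted-average fusion, the function names, the weights, the threshold, and the toy similarity values are all assumptions chosen for illustration, and the clustering step is plain DBSCAN without the probabilistic cluster merging.

```python
from collections import deque

def fuse_similarity(sims, weights):
    """Late fusion (assumed form): weighted average of per-modality similarities in [0, 1]."""
    total = sum(weights[m] for m in sims)
    return sum(weights[m] * s for m, s in sims.items()) / total

def build_graph(pair_sims, weights, threshold):
    """Homogeneous graph: keep an undirected edge where the fused similarity reaches the threshold."""
    graph = {}
    for (a, b), sims in pair_sims.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if fuse_similarity(sims, weights) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def dbscan_on_graph(graph, min_pts):
    """Plain DBSCAN where a node's neighbourhood is its adjacency set plus the node itself."""
    labels = {}
    cluster = 0
    for node in sorted(graph):
        if node in labels:
            continue
        if len(graph[node]) + 1 < min_pts:
            labels[node] = -1  # noise for now; may be relabelled as a border point
            continue
        labels[node] = cluster
        queue = deque(graph[node])
        while queue:
            nb = queue.popleft()
            if labels.get(nb, -1) != -1:
                continue  # already assigned to a cluster
            labels[nb] = cluster
            if len(graph[nb]) + 1 >= min_pts:
                queue.extend(graph[nb])  # only core nodes expand the cluster
        cluster += 1
    return labels

# Toy data (hypothetical): pairwise similarities per modality for six posts.
pair_sims = {
    ("p1", "p2"): {"text": 0.9, "visual": 0.8, "time": 0.9},
    ("p1", "p3"): {"text": 0.8, "visual": 0.6, "time": 0.9},
    ("p2", "p3"): {"text": 0.7, "visual": 0.7, "time": 0.8},
    ("p3", "p4"): {"text": 0.1, "visual": 0.2, "time": 0.3},
    ("p4", "p5"): {"text": 0.9, "visual": 0.9, "time": 0.7},
    ("p1", "p6"): {"text": 0.2, "visual": 0.1, "time": 0.1},
}
weights = {"text": 0.5, "visual": 0.3, "time": 0.2}

graph = build_graph(pair_sims, weights, threshold=0.6)
labels = dbscan_on_graph(graph, min_pts=2)
print(sorted(labels.items()))
```

With these toy values, p1, p2, and p3 fall into one cluster, p4 and p5 into a second, and p6 is labelled noise (-1). Fusing similarities before clustering is what lets a single-graph algorithm such as DBSCAN operate on data that originally mixed several modalities.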
[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.092


Article ID: 2083771


Article link: http://sikaile.net/guanlilunwen/ydhl/2083771.html

