Research on Adaptive Microblog Topic Tracking Based on Temporal Development

Published: 2018-11-27 08:10
[Abstract]: With the rapid development of the Internet, social networks have won over more and more users thanks to their interactivity, freedom and openness. Since 2006, when Twitter, the world's first microblogging service (hereafter "microblog"), was launched by Obvious, Evan Williams's company in the United States, microblog services have grown vigorously. Unlike traditional news and blogs, a microblog post is short, limited to 140 characters. Besides short text, however, users can attach pictures, video, audio and other links. This free and open mode of communication has been welcomed by users, and microblog services have spread quickly around the world.

Because microblogs are free, interactive and open, people can share what they see and hear, or express their feelings and attitudes, at any time and from anywhere. As the number of users has risen sharply, the volume of microblog information has grown with it, and breaking events often surface first on microblog platforms. Microblog topic detection is therefore attracting researchers and is becoming a research hotspot. People are sometimes more interested in how a particular event develops, however, which makes microblog topic tracking especially important. To make full use of the time sensitivity of microblogs and to detect and track hot topics promptly, this thesis carries out the following work:

1. To address the large volume and strong time sensitivity of microblog data, a hot topic discovery method based on growth speed is proposed. The preprocessed posts are first divided into windows that each hold the same number of posts; the frequency of every term in every window is counted and represented as a sequence of time 2-tuples. The growth slope of each term between every two adjacent windows is then computed to find fast-growing terms. Next, the growth rates of the number of users and the number of posts associated with such a term are computed to decide whether it is a hot topic word. Finally, hot topics are produced by clustering the hot topic words. The results show that the method is effective at discovering new topics.

2. To address topic drift in topic tracking, an adaptive microblog topic tracking method based on temporal development is proposed. First, to counter data sparsity in microblog tracking, feature words are expanded with a method based on relevance retrieval. Second, because fixed feature-word weights tend to lower recall, a weight adjustment strategy based on time decay is used to decay the weights appropriately. Finally, because a static topic template cannot follow a changing topic, a template adjustment method based on double filtering is proposed: only reports that are both relevant and have a high importance score are used to update the topic template. Experiments show that the method improves tracking performance to some extent.

3. A network public opinion monitoring system built on the adaptive topic tracking algorithm is designed and implemented. The proposed method is applied to topic template adjustment in the topic tracking module of the system: microblog entries with high importance scores update the topic template, which gives the system higher recall and precision and meets users' needs.
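The window-based growth test from item 1 and the time-decay weighting from item 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract gives no formulas, so the post layout `(user_id, tokens)`, the threshold `min_slope`, the function names, and the exponential half-life form of the decay are all hypothetical choices.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical record layout: (user_id, tokens of one preprocessed post).
Post = Tuple[str, List[str]]
# Per-term statistics: (term frequency, distinct users, posts containing the term).
Stats = Dict[str, Tuple[int, int, int]]


def split_into_windows(posts: Sequence[Post], window_size: int) -> List[List[Post]]:
    """Split time-ordered posts into windows holding an equal number of posts.

    The last window may be shorter if the total is not a multiple of window_size.
    """
    return [list(posts[i:i + window_size]) for i in range(0, len(posts), window_size)]


def window_stats(window: Sequence[Post]) -> Stats:
    """Count, for each term, its frequency, its distinct users and its posts."""
    freq: Counter = Counter()
    post_count: Counter = Counter()
    users: Dict[str, Set[str]] = {}
    for user, tokens in window:
        freq.update(tokens)
        for term in set(tokens):
            post_count[term] += 1
            users.setdefault(term, set()).add(user)
    return {t: (freq[t], len(users[t]), post_count[t]) for t in freq}


def hot_terms(posts: Sequence[Post], window_size: int, min_slope: float = 2.0) -> Set[str]:
    """Return terms whose frequency, user count and post count all grow by at
    least min_slope between some pair of adjacent windows (assumed criterion)."""
    stats = [window_stats(w) for w in split_into_windows(posts, window_size)]
    hot: Set[str] = set()
    for prev, curr in zip(stats, stats[1:]):
        for term, now in curr.items():
            before = prev.get(term, (0, 0, 0))
            # Adjacent windows are one step apart, so the slope is the difference.
            if all(n - b >= min_slope for n, b in zip(now, before)):
                hot.add(term)
    return hot


def decayed_weight(weight: float, age_hours: float, half_life_hours: float = 24.0) -> float:
    """Assumed exponential decay: the weight halves every half_life_hours."""
    return weight * 0.5 ** (age_hours / half_life_hours)
```

Calling `hot_terms(posts, window_size=3)` on a time-ordered list of posts returns the candidate hot topic words, which would then be clustered into topics; `decayed_weight` would be applied to each feature word of a topic template as the template ages.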
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092; TP391.1


Article ID: 2360001


Article link: http://sikaile.net/guanlilunwen/ydhl/2360001.html

