
Research on Adaptive Microblog Topic Tracking Based on Temporal Development

Published: 2018-11-27 08:10
[Abstract]: With the rapid development of the Internet, social networks have won over more and more users thanks to their interactivity, freedom, and openness. Since 2006, when Twitter, the world's first microblogging service, was launched by Obvious, Evan Williams's company in the United States, microblogging services have flourished. Unlike traditional news and blogs, a microblog post is short, limited to 140 characters. In addition to short text, however, users can include pictures, video, audio, and links in their posts. This free and open mode of communication has been welcomed by a vast number of users, and microblogging services have spread rapidly around the world, setting off a microblogging boom.

Because microblogs are free, interactive, and open, people can share what they see and hear, or express their feelings and attitudes, anytime and anywhere. As the number of microblog users has grown sharply, the volume of microblog information has increased day by day, and breaking events often surface readily on microblog platforms. Microblog topic detection is therefore attracting researchers' attention and is gradually becoming a research hotspot. People are sometimes more interested in how a particular event develops, however, which makes microblog topic tracking especially important. To make full use of the time sensitivity of microblogs and to detect and track hot microblog topics promptly, this thesis carries out the following research:

1. To address the large information volume and strong time sensitivity of microblogs, a microblog hot topic discovery method based on growth rate is proposed. The preprocessed microblog posts are first divided into windows containing equal numbers of posts; the frequency of each word in each window is counted and represented as a sequence of time pairs. Fast-growing words are then found by computing each word's growth slope between every two adjacent windows. Next, the growth rate of the users and the growth rate of the number of posts associated with such a word are computed to determine whether it is a hot topic word. Finally, hot topics are generated by clustering the hot topic words. The results show that the method has a strong ability to discover new topics.

2. To address topic drift in topic tracking, an adaptive microblog topic tracking method based on temporal development is proposed. The method first tackles data sparsity in microblog tracking by expanding the feature words with an expansion method based on relevance retrieval. Second, because fixed feature-word weights tend to cause low recall, a weight adjustment strategy based on time decay is used to decay the feature-word weights appropriately. Finally, because the topic template would otherwise remain static, a topic template adjustment method based on double filtering is proposed, which updates the topic template with reports that are both relevant and have high importance scores. Experiments show that the method improves tracking efficiency to some extent.

3. A network public opinion monitoring system built on the adaptive microblog topic tracking algorithm based on temporal development is designed and implemented. The adaptive topic tracking method proposed in this thesis is applied to topic template adjustment in the system's topic tracking module: microblog entries with high importance scores are used to update the topic template, giving the system higher recall and accuracy and meeting users' needs.
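The first stage of the discovery method in item 1 (equal-count windows, per-window word frequencies, growth slopes between adjacent windows) might be sketched roughly as follows. This is an illustration only, not the thesis's implementation. The abstract gives no formulas, so several things here are assumptions: each post is a `(timestamp, tokens)` pair already sorted by time, a window's time is the timestamp of its last post, the slope is the frequency change divided by the time difference between two adjacent windows, and the threshold `min_slope` and both function names are hypothetical.

```python
from collections import Counter


def window_counts(posts, window_size):
    """Split time-ordered (timestamp, tokens) posts into equal-count windows.

    Returns a list of (window_time, Counter) pairs, where window_time is
    assumed to be the timestamp of the last post in the window.
    A trailing partial window is ignored.
    """
    windows = []
    for start in range(0, len(posts) - window_size + 1, window_size):
        chunk = posts[start:start + window_size]
        counts = Counter(tok for _, tokens in chunk for tok in tokens)
        windows.append((chunk[-1][0], counts))
    return windows


def fast_growing_words(windows, min_slope):
    """Return {word: largest slope} for words whose frequency slope between
    any two adjacent windows is at least min_slope (an assumed threshold)."""
    hits = {}
    for (t0, c0), (t1, c1) in zip(windows, windows[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip degenerate windows with no elapsed time
        for word, freq in c1.items():
            # Counter returns 0 for words absent from the earlier window.
            slope = (freq - c0[word]) / dt
            if slope >= min_slope:
                hits[word] = max(slope, hits.get(word, slope))
    return hits
```

Under these assumptions, a word that is absent from one window and frequent in the next gets a large positive slope and is flagged as a candidate. The later stages described in the abstract (checking user and post growth, then clustering the hot topic words) are not covered by this sketch.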
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092; TP391.1




Article ID: 2360002


Article link: http://sikaile.net/guanlilunwen/ydhl/2360002.html

