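Research on a Domain-Based Method for Evaluating Microblog User Influence
Published: 2018-06-04 12:43
Topic keywords: microblog + domain classification; Source: Southwest University, master's thesis, 2014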
[Abstract]: Since its emergence, the microblog has won over a large number of Internet users with its strong interactivity, fast propagation, and concise content, and it is now a popular form of social network. As a widely used information carrier and transmission medium, the microblog already hosts a large volume of flowing information and many active users. Users publish a great deal of content spanning many industries and domains, and large numbers of followers comment on and repost it, which gives that content substantial influence across industries. Evaluating microblog user influence soundly and effectively can yield considerable social benefit: information diffusion, product recommendation, and publicity, for example, can achieve twice the result with half the effort, which matters greatly for commercial marketing. It is therefore of research significance to consider comprehensively how users participate in each domain and to compute their influence per domain. Many researchers in China and abroad have already studied microblog user influence.
Microblogging originated abroad with Twitter, but Twitter differs from domestic microblogs in that it has no comment function. Traditional methods for evaluating microblog user influence are therefore aimed mainly at Twitter: they consider parameters such as a user's follower count, post count, follower quality, repost count, and mention count, but they do not consider the comment function, which is a limitation. Social influence, as commonly understood, is influence within a particular domain, and each user's influence differs from domain to domain, so evaluating influence per domain is also important. Traditional research, however, evaluates a user's influence only in aggregate; it overlooks the fact that users are active across domains and that individual posts can straddle domains, and it does not evaluate a user's influence in different domains.
To address these problems, this thesis proposes a domain-based method for evaluating microblog user influence. The method consists mainly of a KNN-based domain classification algorithm and a microblog user influence algorithm. The main work and contributions are as follows:
First, because traditional research ignores the cross-domain nature of microblog users and the domain overlap of microblog posts, this thesis applies a KNN-based domain classification algorithm. A user usually takes an interest in several domains, so the posts that user publishes touch different domains; this is the cross-domain nature of users. In addition, the domain boundary of a single post is often unclear: a post may belong to both domain A and domain B; this is the domain overlap problem. To take both into account, the algorithm refers to the class labels of a microblog text corpus and classifies posts into 21 domains according to the text of each post, which yields each user's posts in each domain and the total number of posts.
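The classification step can be illustrated with a minimal sketch. The abstract states only that KNN is applied against a labeled microblog corpus; everything else here is an assumption made for illustration: the bag-of-words representation, cosine similarity, whitespace tokenization (real Chinese text would need word segmentation first), `k=3`, the toy corpus, the domain names, and the function names `vectorize`, `cosine`, and `knn_domain_votes`. Returning vote shares rather than a single label is also an assumption, chosen to show how a post that straddles two domains might surface.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts. Whitespace splitting is a stand-in
    for the word segmentation that Chinese text would require."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_domain_votes(post, labeled_corpus, k=3):
    """Return each domain's share of the votes among the k labeled
    posts most similar to `post` (neighbors with zero similarity are ignored)."""
    query = vectorize(post)
    scored = sorted(
        ((cosine(query, vectorize(text)), domain) for text, domain in labeled_corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = [domain for sim, domain in scored[:k] if sim > 0]
    if not top:
        return {}
    votes = Counter(top)
    return {domain: n / len(top) for domain, n in votes.items()}


# Hypothetical labeled corpus; the thesis uses 21 domains, three are shown here.
CORPUS = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sports"),
    ("football club stock shares listed", "sports"),
    ("new phone chip released", "technology"),
]

votes = knn_domain_votes("football club shares rise on stock market", CORPUS, k=3)
print(votes)
```

In this toy run the query's three nearest neighbors are two sports posts and one finance post, so both domains receive votes. Counting a post toward every domain above some vote threshold would be one way to reflect domain overlap, though the abstract does not say how the thesis handles it.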
Second, because the influence indicators in traditional research are too simple, this thesis adds further influence parameters and proposes an algorithm for computing microblog user influence. Traditional research measures user influence mainly by post count, follower count, repost count, and mention count. Microblog user influence is, in essence, interaction among users, and that interaction is reflected not only in the traditional parameters but also in the number of comments a user receives, the user's total online time, and the user's registration time. The proposed algorithm therefore takes the comment function, online time, registration time, and related parameters fully into account.
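The abstract names the parameters but does not give the formula that combines them. The sketch below is therefore not the thesis's algorithm: it is a hypothetical weighted sum of log-damped counts, shown only to make concrete how the added parameters (comments received, online time, registration time) could sit alongside the traditional ones in a per-domain score. The weights, the `log1p` damping, the split between user-level and domain-level parameters, and all names (`UserProfile`, `DomainActivity`, `domain_influence`, `influence_by_domain`) are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class UserProfile:
    followers: int         # traditional parameter
    mentions: int          # traditional parameter (times the user was mentioned)
    online_hours: float    # added parameter: total online time
    days_registered: int   # added parameter: time since registration


@dataclass
class DomainActivity:
    posts: int      # posts the classifier assigned to this domain
    reposts: int    # reposts those posts received
    comments: int   # added parameter: comments those posts received


# Hypothetical weights; the abstract does not specify any.
WEIGHTS = {
    "followers": 0.20,
    "mentions": 0.10,
    "online_hours": 0.05,
    "days_registered": 0.05,
    "posts": 0.15,
    "reposts": 0.25,
    "comments": 0.20,
}


def damp(value):
    """Log-damp a raw count so very large accounts do not dominate."""
    return math.log1p(max(value, 0))


def domain_influence(profile, activity, weights=WEIGHTS):
    """Weighted sum of damped user-level and domain-level parameters."""
    features = {
        "followers": profile.followers,
        "mentions": profile.mentions,
        "online_hours": profile.online_hours,
        "days_registered": profile.days_registered,
        "posts": activity.posts,
        "reposts": activity.reposts,
        "comments": activity.comments,
    }
    return sum(weights[name] * damp(value) for name, value in features.items())


def influence_by_domain(profile, activities):
    """Score every domain in which the user has at least one post."""
    return {
        domain: domain_influence(profile, activity)
        for domain, activity in activities.items()
        if activity.posts > 0
    }


profile = UserProfile(followers=1000, mentions=50, online_hours=200.0, days_registered=365)
activities = {
    "sports": DomainActivity(posts=40, reposts=300, comments=120),
    "finance": DomainActivity(posts=5, reposts=10, comments=2),
    "travel": DomainActivity(posts=0, reposts=0, comments=0),
}
print(influence_by_domain(profile, activities))
```

With these made-up numbers the same user scores higher in sports than in finance, and travel is omitted because the user has no posts there. Setting the weights of the three added parameters to zero gives a score built only from the traditional parameters, which is one way the two approaches could be compared side by side.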
Third, an experimental analysis is carried out. The traditional method and the proposed method are each used to compute microblog users' influence in each domain, and the two sets of results are compared and analyzed. The experiments show that the proposed domain-based evaluation method is more practical and more reasonable.
This research can effectively evaluate users' influence in each domain, which benefits commercial promotion and is significant for the application and development of microblogs.
[Degree-granting institution]: Southwest University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092