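Design and Implementation of a Microblog Data Analysis System
Published: 2018-06-20 17:47
Topics: microblog + Weibo crawler; Source: Beijing University of Posts and Telecommunications, master's thesis, 2014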
[Abstract]: Microblogging (Weibo) is an Internet-based application whose user base has grown explosively. By actively posting and reposting messages, Weibo users can spread information very widely within a very short time. This rapid growth has produced a large volume of microblog data with considerable latent value. However, current techniques for acquiring microblog data, analyzing it, and presenting the results remain immature: the data cannot be acquired and analyzed effectively, and analysis results are presented in only a narrow range of forms.

This thesis first analyzes microblogging and its characteristics, focusing on the Sina Weibo platform. Second, it studies methods for acquiring microblog data, and designs and implements a crawler for Sina Weibo based on simulated login. Finally, it studies analysis methods for microblog data and the presentation of results, designing effective analysis methods and an intuitive, visually appealing, interactive way of displaying the results. The specific work is as follows:

1) Study the concept, characteristics, and main applications of microblogging. Sina Weibo is the focus of the thesis, and its characteristics are analyzed.
2) Study methods for acquiring microblog data: analyze acquisition through the Weibo API and identify the limitations of that approach, and introduce traditional web crawlers and their methods.
3) Design a data acquisition system for Sina Weibo, covering requirements analysis, database design, and crawler design. The crawler design covers simulated login, web page data extraction, and acquisition methods for different types of microblog data.
4) Design a microblog message analysis system, covering requirements analysis, database design, and the design of the analysis methods. The methods are keyword extraction from messages; message propagation analysis, comprising audience analysis and key retweeter discovery; and detection of "water army" (paid-poster) accounts.
5) Design a microblog data analysis and display platform with a B/S (browser/server) architecture, combining HTML5 and JSP to present the analysis results as web pages.

The resulting system can effectively acquire microblog data, analyze message data, and present the results in an attractive and novel way on the display platform, which offers good user interactivity.
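The abstract names key retweeter discovery as part of propagation analysis but does not describe the algorithm used. One simple way to approximate the idea, not necessarily the thesis's method, is to treat the reposts of a single message as a tree and rank each user by the number of downstream reposts that passed through them. The sketch below assumes each user reposts the message at most once; the function name `rank_key_retweeters`, the `(user, parent)` edge format, and the sample data are all hypothetical.

```python
from collections import defaultdict


def rank_key_retweeters(reposts, root, top_n=3):
    """Rank users by how many downstream reposts they triggered.

    reposts: iterable of (user, parent) pairs, meaning `user` reposted
             the message from `parent` (hypothetical input format).
    root:    author of the original message (excluded from the ranking).
    """
    children = defaultdict(list)
    for user, parent in reposts:
        children[parent].append(user)

    downstream = {}      # user -> number of reposts in their subtree
    tree_children = {}   # user -> children first reached through them
    seen = {root}        # guards against duplicates and cycles in dirty data
    stack = [(root, False)]

    # Iterative post-order traversal: children are counted before parents.
    while stack:
        node, expanded = stack.pop()
        if expanded:
            downstream[node] = sum(1 + downstream[c] for c in tree_children[node])
            continue
        kids = []
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                kids.append(child)
        tree_children[node] = kids
        stack.append((node, True))
        stack.extend((c, False) for c in kids)

    # Sort by downstream count (descending), then by name for stable ties.
    ranked = sorted(
        (u for u in downstream if u != root),
        key=lambda u: (-downstream[u], u),
    )
    return [(u, downstream[u]) for u in ranked[:top_n]]


# Made-up cascade: "a" posts; "b" and "c" repost from "a"; and so on.
example = [("b", "a"), ("c", "a"), ("d", "b"), ("e", "b"), ("f", "d")]
print(rank_key_retweeters(example, "a", top_n=2))
```

In this made-up cascade, user `b` ranks first because three later reposts trace back to them. A real system would likely also weigh follower counts and timing, which this sketch ignores.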
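[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092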
Article ID: 2045159
Article link: http://sikaile.net/guanlilunwen/ydhl/2045159.html