Design and Implementation of a Network Data Analysis System Based on the Spark Platform
[Abstract]: With the rapid development of Internet technology, the content distribution network (CDN) plays an important role in the Internet architecture, and users' online activity is recorded in the logs of CDN service providers. The major CDN vendors share some common requirements for analyzing massive network data, and their product managers, operators and other non-technical staff all need to carry out general data analysis on it. CDN service providers, however, lack a common network data analysis service platform. There is therefore an urgent need for a general-purpose platform that lets CDN vendors analyze network data without first having to master big data technology. To design a general, simple and extensible service platform for analyzing massive network data, this thesis builds on existing distributed frameworks and designs and implements a network data analysis service platform based on Spark. The main work of this thesis is as follows:
(1) Preprocessing and processing of massive network data based on Spark. According to the characteristics of network data, the thesis designs and implements a network data analysis service tool.
(2) Web technology for the big data platform. The thesis studies how to browse network data held in the distributed storage engine from the Web platform, and how to launch massive network data analysis tasks through the Web platform.
(3) Management of the big data platform based on Yarn. To ensure the usability of the whole system, the thesis analyzes the relationship between the resource manager Yarn and the computing engine Spark, and studies how to monitor Spark tasks on the platform by monitoring Yarn.
(4) Visualization of big data analysis results. After surveying third-party visualization plug-ins, the thesis proposes using Echarts to present the analysis results on the page.
Following the solutions obtained from this technical research, the thesis implements the Spark-based data analysis functions and the big data Web platform, and verifies their effectiveness through experiments. Building on these key technologies, it completes the development of the network data analysis service platform, which provides users with network data analysis, network data preview, result data visualization and system monitoring. The platform helps its users understand the characteristics of Internet behavior, and creates the conditions for providers and CDN vendors to optimize their own services.
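Item (3) of the abstract, monitoring Spark tasks by monitoring Yarn, can be illustrated with a small sketch. It is not the thesis's implementation. It assumes the monitoring component reads the JSON that the Yarn ResourceManager REST endpoint `GET /ws/v1/cluster/apps` returns (shape `{"apps": {"app": [...]}}`) and keeps only the applications whose `applicationType` is `SPARK`. The function name `summarize_spark_apps` and the sample application ids and names are hypothetical.

```python
from typing import Any, Dict, List


def summarize_spark_apps(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Extract a status summary of Spark applications from a Yarn apps payload.

    Assumption: `payload` is the parsed JSON body of the ResourceManager
    endpoint GET /ws/v1/cluster/apps. Yarn may return {"apps": null} when
    no application matches, so missing levels are treated as empty.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    summary = []
    for app in apps:
        # Keep only applications submitted by Spark; skip MapReduce etc.
        if str(app.get("applicationType", "")).upper() != "SPARK":
            continue
        summary.append({
            "id": app.get("id"),
            "name": app.get("name"),
            "state": app.get("state"),              # e.g. RUNNING, FINISHED
            "final_status": app.get("finalStatus"),  # e.g. SUCCEEDED, FAILED
            "progress": app.get("progress"),         # percentage, 0 to 100
        })
    return summary


if __name__ == "__main__":
    # Hypothetical sample payload; a real one would come from an HTTP GET
    # against the ResourceManager web address.
    sample = {
        "apps": {
            "app": [
                {"id": "application_1_0001", "name": "cdn-log-analysis",
                 "applicationType": "SPARK", "state": "RUNNING",
                 "finalStatus": "UNDEFINED", "progress": 40.0},
                {"id": "application_1_0002", "name": "word-count",
                 "applicationType": "MAPREDUCE", "state": "FINISHED",
                 "finalStatus": "SUCCEEDED", "progress": 100.0},
            ]
        }
    }
    for row in summarize_spark_apps(sample):
        print(row)
```

Separating the parsing step from the HTTP request keeps the sketch testable without a running cluster. A Web platform of the kind the abstract describes could poll the endpoint periodically and pass each response through such a function.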
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.09; TP311.13