HTTP Traffic Analysis in a Multi-Resource-Server Collaborative Environment
Published: 2019-01-26 20:08
[Abstract]: A few years ago, HTTP-based network services were delivered centrally by a small number of service providers, and distributed servers were rare. Typically, a single server offered one particular service at a fixed IP address. Today network structure is increasingly complex, and the relationship between an IP address and the content and services behind it has become dynamic and complicated: operators make heavy use of content delivery networks (CDNs) and content caches, cloud-based services keep emerging, and the coupling between service providers and the infrastructure that hosts their services is weakening. All of this makes network management harder. Operators therefore urgently need to understand the composition and usage patterns of HTTP traffic and to determine how that traffic is distributed across service providers, so that network resources can be allocated sensibly. At the same time, because of the rapid growth in traffic volume, traditional traffic analysis methods can no longer meet the storage and processing requirements of massive data sets, and more efficient and more reliable processing methods are needed. Hadoop is a scalable open-source software framework for reliable distributed processing of massive data, and it is being applied in a growing number of research fields. This thesis first introduces an HTTP traffic analysis algorithm based on association rules, which uses the Jaccard coefficient to measure traffic correlation, and gives its mathematical description. It then introduces the basic principles of Hadoop and, building on Hadoop, proposes a three-layer architecture for an HTTP traffic analysis system that integrates the otherwise separate functions of traffic collection, storage, processing, and analysis into one complete processing system. Next, the thesis focuses on the IP address identification component in the system's data layer. This component maps server IP addresses to service providers and is the most important part of the HTTP traffic analysis system described here. Finally, using the intermediate results produced by the collection layer and the data layer, the thesis summarizes the distribution patterns of HTTP traffic in the HTTP traffic analysis application layer.
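The abstract names the Jaccard coefficient as its measure of traffic correlation but does not reproduce the definition. For two sets A and B, the standard definition is J(A, B) = |A ∩ B| / |A ∪ B|. The sketch below is only an illustration of that formula, not the thesis's algorithm: it assumes, hypothetically, that each server IP is described by the set of clients observed contacting it, and that two servers count as correlated when their client sets overlap strongly. The names `clients_by_server` and `similar_pairs`, the sample addresses, and the 0.5 threshold are all invented for the example.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_pairs(clients_by_server: dict, threshold: float = 0.5) -> list:
    """Return (server_a, server_b, score) for every pair scoring at or above threshold."""
    pairs = []
    for (ip_a, set_a), (ip_b, set_b) in combinations(sorted(clients_by_server.items()), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            pairs.append((ip_a, ip_b, score))
    return pairs


# Hypothetical observations: client identifiers seen per server IP.
clients_by_server = {
    "203.0.113.10": {"u1", "u2", "u3", "u4"},
    "203.0.113.11": {"u2", "u3", "u4", "u5"},
    "198.51.100.7": {"u9"},
}

for ip_a, ip_b, score in similar_pairs(clients_by_server):
    print(ip_a, ip_b, round(score, 2))
```

In this made-up data the first two servers share three of five distinct clients, so they are the only pair that clears the threshold. On Hadoop, a pairwise comparison like this would normally be expressed as MapReduce jobs rather than an in-memory loop; how the thesis does so is not stated in the abstract.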
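[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP393.06
Article ID: 2415863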