Research on Characterization and Identification Methods for Network End Targets Based on Flow Behavior Feature Analysis
Published: 2018-11-26 16:36
[Abstract]: With the rapid development of the Internet, how to effectively supervise network traffic and user behavior, and thereby build a civilized, healthy, trustworthy, and stable cyberspace, has gradually attracted researchers' attention. As a result, characterizing and identifying the different people in a network (that is, end targets) has become a focus of current research. In recent years, most work has studied how to classify network flows using flow behavior features, while comparatively little has applied those features to characterizing and identifying network end targets. In response, this thesis proposes an analysis method based on service-type partitioning: traffic is first classified by service type, and that classification is applied to the extraction and selection of flow behavior features, yielding a characterization of each network end target. Machine learning and community detection algorithms are then introduced to complete the identification of network end targets, with good results. The main work is as follows:

(1) For identifying individual end targets, that is, determining which end target produced a particular user behavior, the thesis introduces a machine-learning-based classification method. User traffic is first sorted into the 24 service types defined by the author and used to build the end target's traffic matrix. The raw packets are then processed to obtain the flow behavior features needed for analysis, and feature selection yields the set of feature parameters that characterizes an end target. In this way, one day of traffic data is converted into one sample characterizing that end target's behavior. Once enough samples have been collected and manually labeled, the C4.5 decision tree algorithm is used for training and testing on them, achieving good identification results.

(2) For behavioral similarity between individual end targets, that is, discovering latent community groups in the network, the thesis proposes a community detection algorithm based on flow behavior feature analysis. To measure behavioral similarity between end targets, the author uses Dice similarity to compute the similarity of flow behavior features and cosine similarity to compute the similarity of service types, and builds similarity matrices from them. A community detection algorithm then produces community partitions based on flow behavior features and on service types respectively, and the two are combined to obtain the final community partition.
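The two similarity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how flow behavior features are encoded, so the sketch assumes each end target's features have been discretized into a set of labels (for Dice similarity) and that service-type usage is a numeric vector of per-type traffic volume (for cosine similarity). The host names, feature labels, and the four-element vectors (standing in for the 24 service types) are hypothetical.

```python
import math


def dice_similarity(a, b):
    """Dice coefficient between two sets: 2 * |A & B| / (|A| + |B|)."""
    if not a and not b:
        # Two empty sets are treated as identical (a choice made for this sketch).
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # An all-zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(items, sim):
    """Pairwise similarity matrix for a list of items under the measure `sim`."""
    n = len(items)
    return [[sim(items[i], items[j]) for j in range(n)] for i in range(n)]


# Hypothetical end targets: discretized flow-feature labels per host.
flow_features = {
    "host_a": {"short_flows", "small_packets", "daytime"},
    "host_b": {"short_flows", "large_packets", "daytime"},
    "host_c": {"long_flows", "large_packets", "night"},
}

# Hypothetical per-service-type traffic volumes (4 types shown instead of 24).
service_usage = {
    "host_a": [120, 5, 0, 30],
    "host_b": [100, 8, 2, 25],
    "host_c": [0, 90, 60, 1],
}

hosts = sorted(flow_features)
feature_sim = similarity_matrix([flow_features[h] for h in hosts], dice_similarity)
service_sim = similarity_matrix([service_usage[h] for h in hosts], cosine_similarity)

for name, matrix in (("flow features (Dice)", feature_sim),
                     ("service types (cosine)", service_sim)):
    print(name)
    for host, row in zip(hosts, matrix):
        print(" ", host, [round(value, 3) for value in row])
```

Either matrix could then be passed to a community detection algorithm as a weighted adjacency matrix. The abstract does not name the algorithm or say how the two partitions are combined, so those steps are left out of the sketch.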
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.06
Article ID: 2359073
Article link: http://sikaile.net/guanlilunwen/ydhl/2359073.html