Research on P2P Traffic Identification Based on Transfer Learning
[Abstract]: With the large-scale growth of P2P-based Internet applications and the rapid increase in the number of users, P2P traffic consumes a growing share of network resources, which puts increasing pressure on the construction and maintenance of data transmission networks. How to manage P2P applications well, so that they can develop healthily within existing network resources, is a question that researchers in China and abroad follow closely. P2P traffic identification is the foundation of managing P2P applications, and research on it has continued without interruption. The main approaches at present are port-based detection, content-based (payload) scanning, and identification based on traffic features. Each solves the P2P traffic identification problem to some extent, but each has its own shortcomings.
Machine learning, currently an active research direction in computer science, refers to algorithms that automatically extract regularities from data and use them to make predictions about unseen data. Many machine learning algorithms can identify P2P traffic effectively, but they all depend on a large number of manually labeled training samples, and these samples are hard to reuse once network conditions change, which they do rapidly. This thesis works within the newer machine learning framework of transfer learning and combines it with traditional machine learning algorithms to propose a new scheme for P2P traffic identification. The new algorithms achieve good recognition accuracy with only a small number of manually labeled samples. The main contributions and innovations of the thesis are as follows.
First, the thesis studies the transfer learning method based on adaptive boosting, which comes from the field of text classification, introduces it into P2P traffic identification, and proposes an improved algorithm that places more emphasis on real-time performance. The method is adapted to the characteristics of P2P traffic identification by adjusting the weights of the auxiliary data, so that knowledge is transferred to the source data in a more targeted way. The weighted auxiliary data and the source data form a combined training set that is used to train the classifier, which yields a reliable P2P recognizer. On this basis, the thesis also applies dynamic pruning of the auxiliary data, driven by the error rate of each iteration, to remove auxiliary data that differs too much from the source data; this speeds up the iterations and reduces time consumption. Simulation results show that the improved algorithm has better real-time performance and applicability.
Second, by combining the traditional K-nearest-neighbor method with the transfer learning framework, the thesis proposes a K-nearest-neighbor-based transfer learning method, applies it to P2P traffic identification, and improves it in terms of complexity. The algorithm uses the K-nearest-neighbor method to filter the auxiliary data, removing the auxiliary data that differs from the source data. The remaining auxiliary data, which is more similar to the source data, is combined with the source data to form a combined training set for training a reliable P2P traffic classifier. On this basis, the data is pre-grouped by singular value decomposition, which reduces the computational cost of the K-nearest-neighbor method. Simulation results show that the algorithm is effective and that the improvement enhances the real-time performance of the whole algorithm.
Third, a simple P2P traffic identification system based on Java and the Web is built to make it easier to verify and share algorithms and data sets. The system implements the two algorithms above and exposes them through the Web. Users can upload their own data sets for identification or download data sets contributed by others, so the system provides a platform for exchanging P2P traffic identification algorithms.
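The K-nearest-neighbor filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact filtering criterion, so the rule used here is an assumption: an auxiliary sample is kept only when its label agrees with the majority label of its k nearest source samples. The function name `knn_filter`, the two toy features, and all sample values are hypothetical, and the singular value decomposition pre-grouping is left out.

```python
import math
from collections import Counter


def knn_filter(source, auxiliary, k=3):
    """Keep auxiliary samples whose label matches the majority label of
    their k nearest source samples (an assumed rule, not the thesis's)."""
    kept = []
    for features, label in auxiliary:
        # Rank source samples by Euclidean distance to this auxiliary sample.
        neighbours = sorted(source, key=lambda s: math.dist(s[0], features))[:k]
        majority = Counter(lbl for _, lbl in neighbours).most_common(1)[0][0]
        if majority == label:
            kept.append((features, label))
    return kept


# Toy flow features (e.g. mean packet size, flow duration); values are made up.
source = [((1.0, 1.0), "p2p"), ((1.2, 0.9), "p2p"), ((0.9, 1.1), "p2p"),
          ((5.0, 5.0), "other"), ((5.1, 4.8), "other"), ((4.9, 5.2), "other")]
auxiliary = [((1.1, 1.0), "p2p"),     # agrees with nearby source samples
             ((5.0, 5.1), "p2p"),     # conflicts with nearby source samples
             ((4.8, 5.0), "other")]   # agrees with nearby source samples

# Combined training set: all source data plus the auxiliary data that survived.
training_set = source + knn_filter(source, auxiliary, k=3)
```

Any classifier could then be trained on `training_set`. The brute-force neighbor search costs one distance computation per source sample for every auxiliary sample, which is the kind of cost the pre-grouping mentioned in the abstract is meant to reduce.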
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP18; TP393.02