
Research on P2P Traffic Identification Based on Transfer Learning

Published: 2018-07-31 16:31

[Abstract]: With the large-scale growth of Internet applications built on P2P technology and the sharp rise in their user numbers, the network resources that P2P consumes are putting increasing pressure on the construction and maintenance of data transmission networks. How to manage P2P applications so that they can develop healthily within existing network resources is a topic of active interest among researchers in China and abroad.

P2P traffic identification is the foundation of managing P2P applications and has been studied continuously. The main current approaches are port-based detection, content-based scanning, and identification based on traffic characteristics. Each solves the problem to some extent, but each has its own shortcomings. Machine learning, an active research area in computer science, covers algorithms that automatically derive patterns from data and use those patterns to make predictions about unseen data. Quite a few machine learning algorithms can already identify P2P traffic effectively, but they all depend on large numbers of manually labeled training samples, and those samples are hard to reuse once network conditions change rapidly. Working within transfer learning, a new machine learning framework, and combining it with traditional machine learning algorithms, this thesis proposes new schemes for P2P traffic identification that achieve good identification accuracy with only a small number of manually labeled samples. The main contributions and innovations are the following three.

First, the thesis studies the adaptive-boosting-based transfer learning method from the field of text classification, introduces it into P2P traffic identification, and proposes an improved algorithm that places more emphasis on real-time performance. The method is combined with the characteristics of P2P traffic identification: by adjusting the weights of the auxiliary data, the auxiliary data is transferred to the source data in a more targeted way, and the two together form a combined training set for training the classifier, which yields a reliable P2P identifier. On this basis, the thesis also applies dynamic pruning of the auxiliary data based on the per-iteration error rate, which removes auxiliary data that differs too much from the source data, speeds up the iterations, and reduces running time. Simulation experiments show that the improved algorithm is better suited to real-time use and practical application.

Second, the traditional K-nearest-neighbor method is combined with the transfer learning framework to give a K-nearest-neighbor-based transfer learning method, which is applied to P2P traffic identification and further improved in terms of complexity. The algorithm uses K-nearest neighbors to filter the auxiliary data, discarding auxiliary data that differs substantially from the source data, so that the auxiliary data most similar to the source data joins the source data in a combined training set used to train a reliable P2P traffic classifier. On this basis, the thesis also uses singular value decomposition for pre-grouping, which reduces the computation in the K-nearest-neighbor step. Simulation experiments confirm the effectiveness of the algorithm and show that the improvement enhances the real-time performance of the algorithm as a whole.

Third, a simple Java- and Web-based P2P traffic identification system is built to make it easier to test and share algorithms and datasets. Building on the two algorithms above, the system implements them with the Web as its interface and Java as its core language, and makes them publicly available. Users can upload their own datasets for identification or download datasets shared by others, which provides an effective platform for exchanging P2P traffic identification algorithms.
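As an illustration of the weight adjustment described in the first contribution, the following is a minimal sketch of one round of the weight update in adaptive-boosting-based transfer learning as it is commonly published for text classification (often called TrAdaBoost). The abstract does not give the thesis's own formulas, so treating this as the variant the thesis uses is an assumption, as are the function name, the parameter names, the clipping of the error rate, and the example values. The dynamic pruning step is not shown.

```python
import math


def tradaboost_weight_update(weights, errors, n_aux, n_rounds):
    """One round of a TrAdaBoost-style weight update (illustrative sketch).

    weights  : current sample weights, auxiliary samples first, then source samples
    errors   : per-sample |h(x) - y| in [0, 1] for the current weak classifier
    n_aux    : number of auxiliary samples (must be >= 1)
    n_rounds : total number of boosting rounds
    Returns the updated weights and the weighted error on the source data.
    """
    if n_aux < 1 or n_aux >= len(weights):
        raise ValueError("need at least one auxiliary and one source sample")

    src_w = weights[n_aux:]
    src_e = errors[n_aux:]

    # The error rate is measured on the source data only.
    eps = sum(w * e for w, e in zip(src_w, src_e)) / sum(src_w)
    # Assumption: clip eps so that beta_t stays finite and below 1.
    eps = min(max(eps, 1e-10), 0.499)

    beta_t = eps / (1.0 - eps)  # changes every round
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_aux) / n_rounds))  # fixed

    new_weights = []
    for i, (w, e) in enumerate(zip(weights, errors)):
        if i < n_aux:
            # Misclassified auxiliary samples lose weight.
            new_weights.append(w * beta ** e)
        else:
            # Misclassified source samples gain weight.
            new_weights.append(w * beta_t ** (-e))
    return new_weights, eps


if __name__ == "__main__":
    # Hypothetical example: 2 auxiliary samples followed by 4 source samples.
    w = [1.0] * 6
    err = [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    new_w, eps = tradaboost_weight_update(w, err, n_aux=2, n_rounds=10)
    print(eps, new_w)
```

Auxiliary samples that the current classifier gets wrong are treated as unlike the source data and are down-weighted, while misclassified source samples are up-weighted as in ordinary adaptive boosting. Repeating this over many rounds is one way the auxiliary data can be transferred "in a more targeted way".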
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP18;TP393.02


Article ID: 2156157


Article link: http://sikaile.net/guanlilunwen/ydhl/2156157.html
