Research on a P2P Traffic Identification Method Using Particle Swarm Optimization Fused with Cuckoo Search
Published: 2019-02-14 12:21
[Abstract]: With the growth of the Internet, peer-to-peer (P2P) networking has come into wide use and now accounts for more than 50% of total Internet service traffic. While it makes work and daily life more convenient, P2P also causes problems such as network congestion and information security risks. P2P traffic therefore needs to be managed and controlled, which makes identifying P2P traffic the key problem.
P2P traffic identification is essentially a pattern recognition problem, and its accuracy depends largely on the traffic features that are selected and on how the classifier is built. This thesis studies the application of cuckoo search and particle swarm optimization (PSO) to P2P traffic feature selection and to the construction of an optimal P2P classifier. The main work is as follows.
(1) P2P traffic feature selection using particle swarm optimization fused with cuckoo search. In P2P traffic identification, a single feature usually gives a low identification rate, so in practice several features have to be combined to raise it. Although a support vector machine (SVM) classifier can mitigate the curse of dimensionality, it cannot avoid the problem when there are too many features. Extra features also increase the workload of traffic feature sampling, lower the efficiency of the identification algorithm, and make real-time requirements hard to meet. This thesis therefore introduces a new feature selection method, based on particle swarm optimization fused with cuckoo search, that picks the feature subset with the best classification performance from a large feature set in order to improve both identification accuracy and computational efficiency.
(2) A P2P traffic identification method based on particle swarm optimization fused with cuckoo search. SVM copes well with problems that traditional machine learning faces, such as overfitting, underfitting, entrapment in local optima, and the curse of dimensionality, so this thesis uses SVM for P2P traffic identification. However, the penalty parameter, the choice of kernel function, and the kernel parameters strongly affect SVM performance. In practice there is no generally accepted tuning method: common approaches are either computationally expensive, such as grid search, or prone to local optima, such as SVM parameter optimization based on a genetic algorithm. This thesis therefore uses particle swarm optimization fused with cuckoo search to optimize the SVM parameters.
Finally, the proposed feature selection and SVM parameter optimization methods were tested on UCI machine learning data sets and on real campus P2P data, and compared experimentally with existing genetic algorithm, particle swarm optimization, and cuckoo search methods. The results show that the proposed feature selection algorithm obtains good feature subsets, and that the SVM optimized by the proposed algorithm also achieves better identification performance.
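The abstract does not spell out how cuckoo search is fused into particle swarm optimization, so the following is only a minimal sketch of one plausible combination, not the thesis's actual algorithm: a standard PSO update followed by an occasional cuckoo-style Lévy-flight trial that is kept only when it improves the particle. All names here (`hybrid_pso_cs`, `levy_step`, `position_to_mask`, `p_levy`, `alpha`) and the objective `surrogate_cv_error` are hypothetical. The objective is a smooth stand-in for the cross-validation error of an SVM over (log2 C, log2 gamma); a real experiment would replace it with an actual cross-validated SVM.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step length using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero draw
    return u / abs(v) ** (1 / beta)


def clip(x, lo, hi):
    """Keep a coordinate inside its search interval."""
    return max(lo, min(hi, x))


def position_to_mask(position, threshold=0.5):
    """Assumed encoding for feature selection: coordinate > threshold keeps the feature."""
    return [x > threshold for x in position]


def hybrid_pso_cs(objective, bounds, n_particles=20, n_iter=100,
                  w=0.7, c1=1.5, c2=1.5, p_levy=0.25, alpha=0.01, seed=0):
    """Minimize `objective` over the box `bounds` (illustrative hybrid, see lead-in)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            # Standard PSO velocity and position update.
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], lo, hi)
            val = objective(pos[i])
            # Cuckoo-style trial: Levy flight from the current position,
            # accepted greedily (only if it lowers the objective).
            if rng.random() < p_levy:
                cand = [clip(x + alpha * (hi - lo) * levy_step(rng), lo, hi)
                        for x, (lo, hi) in zip(pos[i], bounds)]
                cand_val = objective(cand)
                if cand_val < val:
                    pos[i], val = cand, cand_val
            # Update personal and global bests.
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def surrogate_cv_error(params):
    """Hypothetical stand-in for SVM cross-validation error over (log2 C, log2 gamma)."""
    log_c, log_gamma = params
    return (log_c - 3.0) ** 2 + (log_gamma + 5.0) ** 2


if __name__ == "__main__":
    search_box = [(-5.0, 15.0), (-15.0, 3.0)]  # assumed ranges for log2 C and log2 gamma
    best, err = hybrid_pso_cs(surrogate_cv_error, search_box)
    print("best (log2 C, log2 gamma):", best, "surrogate error:", err)
```

The same optimizer could in principle drive the feature selection described in item (1): search the unit box with one coordinate per candidate feature, convert each position with `position_to_mask`, and use the classification error on the selected features as the objective. That encoding is an assumption made here for illustration.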
[Degree-granting institution]: Hubei University of Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.02; TP18
Article ID: 2422198
Article link: http://sikaile.net/guanlilunwen/ydhl/2422198.html