Research on Web Page Malicious Code Detection Based on Classifier Ensembles
[Abstract]: The rapid growth of the Internet has enriched entertainment and improved many aspects of daily life, but it has also introduced security risks. Attackers exploit this growth by using malicious code to undermine network security for economic gain, and governments are paying increasing attention to malicious code detection. Detection methods are generally divided into static detection and dynamic detection. Static detection [1] extracts page features mainly by matching rules and signature values. Dynamic detection [2] runs the code in a virtual environment and extracts features from its observed behavior. This thesis targets malicious JavaScript code [3] and detects it with machine learning. The main work and results are as follows:

1. Obfuscated JavaScript code is compiled into machine code by the V8 engine. Guided by the characteristics of malicious code, the operand categories in the machine code are simplified and combined with the opcodes. Bi-gram and tri-gram features are then selected from the processed machine code according to information gain. In addition, a method based on frequency, distance, and mutual information is proposed to find breakpoints that segment each sample, from which the variable-length N-gram features of a single sample are computed. Experiments show that mixing operands with opcodes describes machine-code behavior in finer detail, that variable-length N-gram statistics avoid splitting meaningful sequences, and that classification performance improves.

2. Building on a study of common classification algorithms and classifier ensemble algorithms, an input optimization for ensemble classifiers [5] is proposed to address the problem of a single input: the input data set is processed in several different ways, and the results are used to train multiple internal classifiers that are combined into one ensemble model [6]. A secondary classifier is added, which turns the original single-layer ensemble into a multi-level ensemble, and each classifier is assigned its own weight, with the best weight distribution found through training. Experiments show that the multi-level weighted classifier ensemble achieves better classification performance.

Based on these algorithms, an online malicious code detection system was designed and developed. Users can submit malicious script code or a site address online for quick detection. Users can also submit detection reports and view reports submitted by others. Code that the system identifies as malicious is automatically saved to the database.
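The N-gram selection step in item 1 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the token names (such as `mov_reg`), the toy samples, and the helper names `ngrams`, `entropy`, `information_gain`, and `top_features` are all assumptions made for the example. It treats each bi-gram or tri-gram as a binary feature (present or absent in a sample) and ranks candidates by information gain; the variable-length breakpoint method is not covered.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the set of contiguous n-grams (as tuples) in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)

def information_gain(samples, labels, gram):
    """Information gain of the binary feature 'gram occurs in the sample'."""
    n = len(gram)
    present = [y for s, y in zip(samples, labels) if gram in ngrams(s, n)]
    absent = [y for s, y in zip(samples, labels) if gram not in ngrams(s, n)]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional

def top_features(samples, labels, k, sizes=(2, 3)):
    """Rank every bi-gram and tri-gram seen in the samples by information gain."""
    candidates = set()
    for s in samples:
        for n in sizes:
            candidates |= ngrams(s, n)
    ranked = sorted(candidates,
                    key=lambda g: (-information_gain(samples, labels, g), g))
    return ranked[:k]

# Hypothetical mixed opcode/operand-class tokens; 1 = malicious, 0 = benign.
toy_samples = [
    ["push_reg", "mov_reg", "call_mem", "xor_reg"],
    ["mov_reg", "call_mem", "xor_reg", "ret"],
    ["push_reg", "mov_reg", "add_imm", "ret"],
    ["mov_reg", "add_imm", "add_imm", "ret"],
]
toy_labels = [1, 1, 0, 0]

selected = top_features(toy_samples, toy_labels, k=3)
```

In this toy data, a gram that occurs only in the malicious samples separates the classes perfectly, while one that occurs equally in both classes carries no information, which is the property the ranking relies on.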
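The weighting idea in item 2 can be illustrated with a small sketch. It assumes each base classifier already outputs a probability that a sample is malicious, combines those outputs by a weighted average, and grid-searches integer weights on held-out data. The function names, the toy probabilities, and the grid search itself are assumptions for illustration; the abstract does not say how the thesis trains its weights, and the secondary classifier is not modeled here.

```python
from itertools import product

def weighted_vote(prob_rows, weights):
    """Combine per-classifier malicious probabilities into 0/1 predictions."""
    total = sum(weights)
    preds = []
    for probs in prob_rows:
        score = sum(w * p for w, p in zip(weights, probs)) / total
        preds.append(1 if score >= 0.5 else 0)
    return preds

def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def search_weights(prob_rows, labels, steps=4):
    """Try every integer weight vector in [0, steps]^n except all zeros."""
    n = len(prob_rows[0])
    best_w, best_acc = None, -1.0
    for w in product(range(steps + 1), repeat=n):
        if sum(w) == 0:
            continue  # an all-zero weight vector cannot vote
        acc = accuracy(weighted_vote(prob_rows, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Hypothetical validation outputs of three base classifiers (one row per sample).
val_probs = [
    (0.9, 0.4, 0.3),
    (0.8, 0.2, 0.3),
    (0.2, 0.7, 0.8),
    (0.1, 0.6, 0.4),
]
val_labels = [1, 1, 0, 0]

equal_acc = accuracy(weighted_vote(val_probs, (1, 1, 1)), val_labels)
best_weights, best_acc = search_weights(val_probs, val_labels)
```

With these toy numbers, equal weights misclassify half the samples, while a weight vector that favors the first classifier classifies all of them correctly. In practice the weights would be tuned on a validation split separate from the final test data to avoid an optimistic estimate.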
[Degree-granting institution]: Zhejiang University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.08
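[References]
Related journal articles (first 10)
1 Xiu Yang; Liu Jiayong. Malware detection based on opcode sequence frequency vectors and behavior feature vectors [J]. Information Security and Communications Privacy, 2016, No. 09
2 He Ming; Sun Jianjun; Cheng Ying. A survey of text classification research based on naive Bayes [J]. Information Science, 2016, No. 07
3 Zhang Kai; Wang Dong'an; Li Chao; Jia Bing. Malicious code detection based on active learning with collaborative sampling [J]. High Technology Letters, 2016, No. 05
4 Lu Xiaoyong; Chen Musheng. Web spam detection based on random forest and undersampling ensemble [J]. Journal of Computer Applications, 2016, No. 03
5 Liao Guohui; Liu Jiayong. Malicious code detection methods based on data mining and machine learning [J]. Journal of Information Security Research, 2016, No. 01
6 Fu Leipeng; Zhang Han; Huo Luyang. A detection algorithm for malicious JavaScript scripts based on multi-class features [J]. Pattern Recognition and Artificial Intelligence, 2015, No. 12
7 Xiang Tao; Li Tao; Zhao Xuezhuan; Li Xudong. An accurate object detection method based on random forest [J]. Application Research of Computers, 2016, No. 09
8 Li Meng; Jia Xiaoqi; Wang Rui; Lin Dongdai. A feature selection and modeling method for malicious code [J]. Computer Applications and Software, 2015, No. 08
9 Xu Qing; Zhu Yan; Tang Shouhong. Detecting malicious JavaScript code by analyzing multi-class features and fraud techniques [J]. Computer Applications and Software, 2015, No. 07
10 Xuan Yiguang; Zhou Hua. An automatic detection method for JavaScript code obfuscation based on character entropy [J]. Computer Applications and Software, 2015, No. 01
Related doctoral dissertations (first 3)
1 Xie Nannan. Research on the application of machine learning methods in intrusion detection [D]. Jilin University, 2015
2 Sun Xin. Research on feature selection problems in machine learning [D]. Jilin University, 2013
3 Luo Yu. Research on the application of support vector machines in machine learning [D]. Southwest Jiaotong University, 2007
Related master's theses (first 3)
1 Wang Yuheng. Optimization and application of the random forest algorithm in recommender systems [D]. Zhejiang University, 2016
2 Li Yun. Application of machine learning algorithms in data mining [D]. Beijing University of Posts and Telecommunications, 2015
3 Li Yang. Research on web page malicious code detection techniques based on machine learning [D]. Xidian University, 2013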