Research on Automatic Reduction Techniques for Large-Scale Forensic Logs in Intrusion Forensics
[Abstract]: With the rapid development of information technology, computer crime such as hacking intrusions has become a destabilizing factor that cannot be ignored, directly affecting the normal order of the country in politics, the economy, culture, and other fields. Research on intrusion forensics is therefore important for combating computer crime and strengthening the security of computer networks. Log data of all kinds are important candidate sources of evidence for intrusion forensic analysis, because they record many user behaviors as well as the behavior of intrusion detection systems (IDS). However, using existing log data for forensic analysis still poses many problems. One of the most prominent is the sheer scale of the log data set: the volume per week can reach hundreds of thousands or even millions of records. As a result, useful information (such as attack-related events) is inevitably buried under a large number of useless or redundant events triggered by normal system behavior, which makes intrusion forensic analysis more difficult. This thesis proposes a parallel method for automatically reducing forensic logs, based on information theory and attribute weighting. The method is built on the open-source Hadoop framework: the MapReduce model is used to partition the attributes vertically into subsets, and the attribute subsets are processed in parallel. Within each subset, mutual information and entropy weights are used to evaluate the correlation between the current attribute and the other attributes. Attributes with a larger entropy weight and a smaller mutual information value are regarded as more independent and are selected. The selected attributes are then weighted by their entropy weights to obtain Score values; the Score values are sorted, a threshold is set, and the redundant log records to be deleted are identified as an intermediate result. Finally, the remaining log records are reduced a second time using specially designed functions, which yields the final set of redundant log records to delete. Experiments on several representative data sets from Windows and Linux platforms show that the proposed method is fast and efficient, requires no prior knowledge and little manual intervention, and is suitable for large-scale data.
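The scoring idea described in the abstract can be illustrated with a minimal single-process sketch in Python. The abstract does not give the thesis's exact formulas, so everything below is an assumption made for illustration: the function names are hypothetical, the "entropy weight" is taken to be each attribute's share of the total Shannon entropy, the Score of a record is taken to be the entropy-weighted surprisal of its attribute values, and records scoring below the threshold are treated as redundant. The Hadoop/MapReduce partitioning and the second reduction pass are not modeled.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits, using I = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def select_attributes(records, attrs, mi_limit):
    """Keep attributes whose mean MI with the other attributes is <= mi_limit.

    Assumption: 'smaller mutual information' is judged by the mean over
    all other attributes; the thesis may use a different criterion.
    """
    cols = {a: [r[a] for r in records] for a in attrs}
    kept = []
    for a in attrs:
        others = [b for b in attrs if b != a]
        if not others:  # a single attribute has nothing to correlate with
            kept.append(a)
            continue
        mean_mi = sum(mutual_information(cols[a], cols[b]) for b in others) / len(others)
        if mean_mi <= mi_limit:
            kept.append(a)
    return kept


def score_records(records, attrs):
    """Entropy-weighted surprisal of each record (assumed Score definition)."""
    n = len(records)
    cols = {a: [r[a] for r in records] for a in attrs}
    ent = {a: entropy(cols[a]) for a in attrs}
    total = sum(ent.values()) or 1.0  # avoid division by zero if all entropies are 0
    weights = {a: ent[a] / total for a in attrs}
    freq = {a: Counter(cols[a]) for a in attrs}
    return [
        sum(weights[a] * -math.log2(freq[a][r[a]] / n) for a in attrs)
        for r in records
    ]


def redundant_indices(scores, threshold):
    """Indices of records whose Score falls below the threshold."""
    return [i for i, s in enumerate(scores) if s < threshold]


if __name__ == "__main__":
    # Hypothetical log: nine routine logon events and one rare event.
    log = [{"event": "logon"}] * 9 + [{"event": "priv_change"}]
    scores = score_records(log, ["event"])
    print(redundant_indices(scores, threshold=1.0))
```

In this toy log the nine routine records receive a low Score and are flagged as redundant, while the single rare record is retained. The threshold value is arbitrary here; the abstract states only that a threshold is set, not how it is chosen.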
[Degree-granting institution]: Nanjing University (南京大学)
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.08