Research on Proactive Fault Detection Methods for Large-Scale Network Logs
Topic: big data + network logs. Source: Northeast Normal University, master's thesis, 2017
[Abstract]: As network services and network elements multiply, the logs they generate have become one of the most important data sources that network operators rely on for monitoring network health and troubleshooting. In large production networks, analyzing network logs directly for proactive fault detection is a challenging task, for two reasons. First, logs are unstructured: log messages are free-form text, and their format differs across vendors and operating systems. Second, logs are diverse: a large production network contains many kinds of network devices, and the network events that occur on them generate many kinds of log messages. In this thesis, I build two novel models that work together to detect faults automatically: a template extraction model that operates on raw logs, and a fault classification model that operates on log templates. The first model uses clustering to extract log templates directly and automatically from unstructured logs. The second model builds a fault classifier on top of the log templates to decide whether a newly arrived block of logs is related to a fault. In other words, the proactive fault detection model takes raw logs as input and outputs whether those logs are fault-related, so that detection is fast and proactive and network maintainers can carry out preventive maintenance and loss-limiting operations. The thesis first identifies the minimal structure of raw logs. Then, without requiring domain knowledge, it extracts log templates from three different perspectives based on the theory of template words and parameter words, and optimizes the template extraction model. Next, it extracts four features from the log templates, automatically characterizes the patterns of log template sequences, and applies supervised machine learning with a support vector machine and a Gaussian kernel to determine whether the current state is likely to lead to a fault. Finally, real production data from the company where the author interned is used to tune both models and validate their accuracy, demonstrating the practicality of the approach.
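The distinction between template words and parameter words can be illustrated with a minimal sketch. This is not the thesis's extraction model: it assumes a much cruder rule in which messages are grouped by token count, and any position whose token varies within a group is treated as a parameter word. The function name `extract_templates`, the `<*>` wildcard, and the sample log lines are all hypothetical.

```python
from collections import defaultdict

WILDCARD = "<*>"  # hypothetical placeholder for a parameter word


def extract_templates(messages):
    """Group messages by token count and mask the positions whose tokens vary."""
    groups = defaultdict(list)
    for message in messages:
        tokens = message.split()
        groups[len(tokens)].append(tokens)

    templates = []
    for length in sorted(groups):
        rows = groups[length]
        template = []
        # zip(*rows) walks the group column by column (one token position at a time).
        for column in zip(*rows):
            # One distinct token at this position: treat it as a template word.
            # Several distinct tokens: treat it as a parameter word.
            template.append(column[0] if len(set(column)) == 1 else WILDCARD)
        templates.append(" ".join(template))
    return templates


# Made-up log lines, for illustration only.
logs = [
    "Interface eth0 changed state to down",
    "Interface eth1 changed state to up",
    "Login failed for user alice",
    "Login failed for user bob",
]
templates = extract_templates(logs)
for template in templates:
    print(template)
```

Grouping by token count alone would merge unrelated messages that happen to have the same length, which is one reason a clustering-based approach such as the one the abstract describes is needed in practice.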
[Degree-granting institution]: Northeast Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP311.13; TP393.06
Article No.: 1839273
Article link: http://sikaile.net/guanlilunwen/ydhl/1839273.html