Research on Proactive Fault Detection Methods for Large-Scale Network Logs
Topics: big data + network logs. Source: master's thesis, Northeast Normal University, 2017
[Abstract]: With the growth of network services and network elements, the logs they generate are regarded by network operators as one of the important data sources for monitoring network health and troubleshooting. In large production networks, analyzing network logs directly for proactive fault detection is a challenging task, for two reasons. First, logs are unstructured: log messages are free-form text, and the message format differs across vendors and operating systems. Second, logs are diverse: a large production network contains many kinds of network devices, and the network events that occur on them generate many kinds of log messages. In this thesis, I build two novel models that work together to detect faults automatically: a template extraction model based on raw logs, and a classification-based fault detection model based on log templates. The first model uses a clustering approach to extract log templates directly and automatically from unstructured logs. The second builds a fault classifier on top of the log templates, so that it can decide whether a newly arrived block of logs is related to a fault. In other words, the proactive fault detection model takes raw logs as input and outputs a judgment on whether they are fault-related, completing the detection task quickly and proactively and helping network maintainers carry out preventive maintenance and loss-limiting operations. The thesis first analyzes the minimal structure of raw logs; then, without requiring domain knowledge, it extracts log templates from three different perspectives according to the theory of template words and parameter words, and optimizes the template extraction model. Next, it extracts four features from the log templates to automatically characterize the patterns of log template sequences, and applies supervised machine learning with a support vector machine and a Gaussian kernel function to determine whether the current state is likely to lead to a fault. Finally, actual production data from the company where the author interned is used to tune both models and validate their accuracy, which confirms the practicality of the models.
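The abstract does not give the details of the template extraction model, so the following is only a minimal sketch of the general idea of separating template words from parameter words. It assumes a deliberately crude grouping rule (same token count and same first token) as a stand-in for the thesis's clustering step; the function name `extract_templates`, the `<*>` wildcard, and the sample messages are all hypothetical and not taken from the thesis.

```python
from collections import defaultdict

WILDCARD = "<*>"  # hypothetical placeholder for a parameter word


def extract_templates(messages):
    """Return {template: message_count} for a list of raw log messages.

    Assumption (not the thesis's method): messages with the same number
    of tokens and the same first token belong to the same event type.
    """
    groups = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        if tokens:  # skip empty lines
            groups[(len(tokens), tokens[0])].append(tokens)

    templates = {}
    for rows in groups.values():
        template = []
        # Walk the group column by column: a position where every message
        # has the same token is a template word, otherwise a parameter word.
        for column in zip(*rows):
            template.append(column[0] if len(set(column)) == 1 else WILDCARD)
        key = " ".join(template)
        templates[key] = templates.get(key, 0) + len(rows)
    return templates


if __name__ == "__main__":
    sample_logs = [
        "Interface eth0 changed state to down",
        "Interface eth1 changed state to up",
        "Login failed for user alice",
        "Login failed for user bob",
    ]
    for template, count in extract_templates(sample_logs).items():
        print(count, template)
```

A grouping rule this simple breaks down when the first token is itself a parameter or when unrelated events share a length, which is presumably why the thesis relies on clustering and examines templates from three different perspectives. The resulting template sequence would then feed the second model, the classifier.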
[Degree-granting institution]: Northeast Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification numbers]: TP311.13; TP393.06