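Performance Anomaly Detection and Resource Scheduling Methods in MapReduce Environments
Topic: cloud computing + MapReduce; Source: Beijing University of Posts and Telecommunications, 2013 master's thesis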
[Abstract]: MapReduce is a widely known programming framework proposed by Google, and Hadoop is an open-source implementation of it. Because MapReduce is well suited to processing large-scale data, many enterprises use it for tasks such as data mining and data storage. MapReduce needs a scheduling policy to decide how jobs are executed and how resources are allocated while they run. Many current scheduling policies aim mainly at improving cluster resource utilization and do not fully consider a job's completion-time requirement. In addition, MapReduce is a highly complex system built on inexpensive hardware, so anomalies occur frequently; detecting and handling them promptly is important for keeping the system running normally and efficiently.

This thesis studies these two problems:

1) For resource scheduling, it proposes a scheduling mechanism intended to ensure that every job running in the cluster finishes on time and thereby meets its performance requirement. Unlike other scheduling policies, the method predicts how a job will run and allocates resources to each job according to the prediction, so as to avoid unnecessary waste of resources as far as possible. The policy was evaluated in a simulation environment; the results indicate that it can ensure that jobs complete within their expected time while saving resources.

2) For anomaly detection, it proposes and analyzes an anomaly detection method for MapReduce environments. The method is based on the theory of similar nodes and detects anomalies by applying density-based clustering to the system's performance metrics in real time. Both the similar-node theory and the detection algorithm were verified experimentally. Compared with existing methods, the proposed method has a simple processing pipeline, low algorithmic complexity, and high detection sensitivity, and it is suitable for both online and offline detection.
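The abstract does not say which density-clustering algorithm, which metrics, or which thresholds the thesis uses. As a rough illustration only, the idea of flagging a node that behaves unlike its similar peers can be sketched with a DBSCAN-style noise test. The function name `find_anomalous_nodes`, the two metrics, the sample values, and the `eps` and `min_pts` settings are all assumptions made for this example, not details taken from the thesis.

```python
from math import dist


def find_anomalous_nodes(metrics, eps, min_pts):
    """Flag nodes whose metric vector is DBSCAN-style noise.

    metrics: dict mapping node name -> tuple of performance metrics
             (hypothetical here, e.g. (cpu_util, io_wait)).
    A node is a core point if at least min_pts nodes (itself included)
    lie within distance eps of it. A node is reported as anomalous if
    it is neither a core point nor within eps of a core point.
    """
    names = list(metrics)
    # Neighborhood of each node, including the node itself.
    neighbors = {
        a: [b for b in names if dist(metrics[a], metrics[b]) <= eps]
        for a in names
    }
    core = {a for a in names if len(neighbors[a]) >= min_pts}
    return sorted(
        a for a in names
        if a not in core and not any(b in core for b in neighbors[a])
    )


# Hypothetical snapshot: nodes running similar tasks should report
# similar metrics, so the outlier stands apart from the dense group.
sample = {
    "node1": (0.62, 0.10),
    "node2": (0.60, 0.12),
    "node3": (0.65, 0.11),
    "node4": (0.61, 0.09),
    "node5": (0.15, 0.70),  # behaves unlike its peers
}
anomalies = find_anomalous_nodes(sample, eps=0.1, min_pts=3)
print(anomalies)
```

In this sketch the four similar nodes form one dense group and only `node5` falls outside it. A real deployment would need to normalize the metrics and choose `eps` and `min_pts` empirically.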
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP338.6