An Improved HDFS Replica Placement Strategy Based on Support Vector Machine
Published: 2018-05-10 06:03
Topic: support vector machine + cloud storage; Source: Computer Engineering (《计算机工程》), 2015, Issue 11
[Abstract]: To store very large-scale data and improve fault tolerance, the Hadoop Distributed File System (HDFS) adopts a rack-aware multi-replica placement strategy. However, the placement process does not take the differences among node servers into account as a whole, which leads to load imbalance in the cluster. Because nodes are chosen at random during placement, the network distance between nodes can become excessively long, so data transfer consumes a large amount of time. To address these problems, a replica placement strategy based on SVM is proposed. It finds the best placement node for each replica by jointly considering node load, node hardware performance, and node network distance. Experimental results show that, compared with the original HDFS replica placement strategy, the proposed strategy achieves load balancing more effectively.
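The abstract does not state the feature encoding, kernel, or training procedure, so the following is only a minimal sketch of the general idea under stated assumptions. Each candidate node is described by three features (load, hardware performance, network distance), a linear SVM is trained on labeled examples, and the candidate with the largest decision value is selected. The feature scaling, the training data, the hyperparameters, the node names, and every function name here are hypothetical, and a plain subgradient-descent linear SVM stands in for whatever SVM formulation the paper actually uses.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical feature order, each value assumed pre-scaled to [0, 1]:
# (current load, hardware performance, network distance to the writer).
Features = Tuple[float, float, float]


def decision(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear SVM decision value w.x + b; larger means a better placement target."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(
    samples: List[Features],
    labels: List[int],
    lam: float = 0.01,   # L2 regularisation strength (assumed value)
    lr: float = 0.1,     # constant step size (assumed value)
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Minimise lam/2 * ||w||^2 + hinge loss with per-sample subgradient steps."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * decision(w, b, x) < 1.0
            # Gradient of the regulariser shrinks w on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if violated:
                # Hinge-loss subgradient is -y * x when the margin is below 1.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def choose_node(candidates: Dict[str, Features], w: Sequence[float], b: float) -> str:
    """Return the candidate whose feature vector gets the highest decision value."""
    return max(candidates, key=lambda name: decision(w, b, candidates[name]))


# Made-up training set: +1 = suitable node, -1 = unsuitable node.
TRAIN_X: List[Features] = [
    (0.1, 0.9, 0.1), (0.2, 0.8, 0.2), (0.3, 0.9, 0.1),
    (0.9, 0.2, 0.8), (0.8, 0.3, 0.9), (0.7, 0.1, 0.7),
]
TRAIN_Y: List[int] = [1, 1, 1, -1, -1, -1]

W, B = train_linear_svm(TRAIN_X, TRAIN_Y)

# Made-up candidate nodes for one replica.
CANDIDATES: Dict[str, Features] = {
    "dn-a": (0.15, 0.85, 0.15),  # lightly loaded, fast, close
    "dn-b": (0.85, 0.20, 0.80),  # heavily loaded, slow, far
    "dn-c": (0.50, 0.50, 0.50),  # average on every feature
}
best = choose_node(CANDIDATES, W, B)
print(best)
```

Ranking by decision value rather than only by the predicted class lets the sketch break ties among several nodes that are all classified as suitable. A real deployment would also have to respect the rack-awareness constraints of HDFS, which this sketch leaves out.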
[Author affiliation]: College of Computer Science, Chongqing University
[Classification number]: TP333; TP18
Article ID: 1868100
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1868100.html