適用于海量數(shù)據(jù)應(yīng)用的多維Hash表結(jié)構(gòu)
發(fā)布時(shí)間:2018-06-09 15:55
本文選題:多維 + Hash表。 參考:《清華大學(xué)學(xué)報(bào)(自然科學(xué)版)》2017年06期
【摘要】:傳統(tǒng)的Hash表通過對(duì)目標(biāo)數(shù)據(jù)進(jìn)行Hash計(jì)算,可以實(shí)現(xiàn)數(shù)據(jù)的快速存取與檢索。為了保持較好的存儲(chǔ)性能,需要使整個(gè)Hash表保持疏松的狀態(tài),從而犧牲掉10%~25%的空間。這對(duì)于海量數(shù)據(jù)存儲(chǔ)而言,是一種巨大的空間浪費(fèi)。該文提出一種多維Hash表結(jié)構(gòu),通過增加Hash表在邏輯上的維度,大大降低了Hash表的沖突率,實(shí)現(xiàn)了在較高的填充率下獲得較滿意的性能。實(shí)驗(yàn)結(jié)果表明:在千萬(wàn)的數(shù)據(jù)量級(jí)上,二維Hash表的沖突率比傳統(tǒng)Hash表的減小2~4個(gè)數(shù)量級(jí),總體性能則提升了1個(gè)數(shù)量級(jí)。該文還在原有填充率的基礎(chǔ)上,提出失效率的概念,進(jìn)一步完善和統(tǒng)一了Hash表性能評(píng)價(jià)指標(biāo)。
[Abstract]:The traditional Hash table can quickly access and retrieve the target data by Hash calculation. In order to maintain good storage performance, it is necessary to keep the whole Hash table loose at the expense of 10 or 25% of the space. This is a huge waste of space for massive data storage. In this paper, a multidimensional Hash table structure is proposed. By increasing the logical dimension of the Hash table, the collision rate of the Hash table is greatly reduced, and the satisfactory performance is achieved at a higher filling rate. The experimental results show that the collision rate of the two-dimensional Hash table is 2 ~ 4 orders of magnitude less than that of the traditional Hash table, and the overall performance of the two-dimensional Hash table is improved by 1 order of magnitude. Based on the original filling rate, the concept of failure rate is put forward, and the performance evaluation index of Hash table is further improved and unified.
【作者單位】: 國(guó)防科技大學(xué)計(jì)算機(jī)學(xué)院;
【分類號(hào)】:TP311.12
,
本文編號(hào):2000297
本文鏈接:http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2000297.html
最近更新
教材專著