M2LSH:基于LSH的高維數(shù)據(jù)近似最近鄰查找算法

發(fā)布時(shí)間：2018-12-07 19:32

【摘要】：在許多應(yīng)用中,LSH(Locality Sensitive Hashing)以及各種變體,是解決近似最近鄰問題的有效算法之一.雖然這些算法能夠很好地處理分布比較均勻的高維數(shù)據(jù),但從設(shè)計(jì)方案來看,都沒有針對(duì)數(shù)據(jù)分布不均勻的情況做相應(yīng)的優(yōu)化.針對(duì)這一問題,本文提出了一種新的基于LSH的解決方案(M2LSH,2 Layers Merging LSH),對(duì)于數(shù)據(jù)分布不均勻的情況依然能得到一個(gè)比較好的查詢效果.首先,將數(shù)據(jù)存放到具有計(jì)數(shù)功能的組合哈希向量表示的哈希桶中,然后通過二次哈希將這些桶號(hào)投影到一維空間,在此空間根據(jù)各個(gè)桶中存放的數(shù)據(jù)個(gè)數(shù)合并相鄰哈希桶,使得新哈希桶中的數(shù)據(jù)量能夠大致均衡.查詢時(shí)僅訪問有限個(gè)哈希桶,就能找到較優(yōu)結(jié)果.本文給出了詳細(xì)的理論分析,并通過實(shí)驗(yàn)驗(yàn)證了M2LSH的性能,不僅能減少訪問時(shí)間,也可提高結(jié)果的正確率.
[Abstract]:In many applications, LSH (Locality Sensitive Hashing) and various variants are one of the effective algorithms to solve the approximate nearest neighbor problem. Although these algorithms can deal with the high dimensional data with uniform distribution well, from the point of view of the design scheme, there is no corresponding optimization for the non-uniform distribution of the data. In order to solve this problem, a new solution based on LSH (M2LSH2 Layers Merging LSH),) is proposed in this paper. First, the data is stored in the hash bucket represented by the combined hash vector with counting function, and then these bucket numbers are projected to one dimensional space by the quadratic hash, where the adjacent hash buckets are merged according to the number of data stored in each bucket. So that the new hash bucket in the amount of data can be roughly balanced. The query can only access a limited number of hash buckets to find a better result. In this paper, a detailed theoretical analysis is given, and the performance of M2LSH is verified by experiments, which can not only reduce the access time, but also improve the accuracy of the results.
【作者單位】：寧波大學(xué)信息科學(xué)與工程學(xué)院;
【基金】：國家自然科學(xué)基金(No.61472194,No.61572266) 浙江省自然科學(xué)基金(No.LY13F020040,No.LY16F020003) 寧波市自然科學(xué)基金(No.2014A610023,No.2015A610119)
【分類號(hào)】：TP301.6
，

本文編號(hào)：2367724

資料下載

論文發(fā)表

支付寶下載

Download by Alipay
微信下載

Download by Wechat
會(huì)員下載

Download by Member

本文鏈接：http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2367724.html

上一篇：BOW-HOG特征圖像分類
下一篇：大數(shù)據(jù)技術(shù)對(duì)社會(huì)治理的影響及對(duì)策研究

論文發(fā)表

·知網(wǎng)|萬方|維普|龍?jiān)磡省級(jí)|國家級(jí)|科技核心|北大核心|南大核心CSSCI|EI|SCI|SSCI|

天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

M2LSH:基于LSH的高維數(shù)據(jù)近似最近鄰查找算法