天堂国产午夜亚洲专区-少妇人妻综合久久蜜臀-国产成人户外露出视频在线-国产91传媒一区二区三区

當(dāng)前位置:主頁(yè) > 碩博論文 > 信息類博士論文 >

基于量化的近似最近鄰搜索技術(shù)研究

發(fā)布時(shí)間:2018-01-01 06:41

  本文關(guān)鍵詞:基于量化的近似最近鄰搜索技術(shù)研究 出處:《中國(guó)科學(xué)技術(shù)大學(xué)》2017年博士論文 論文類型:學(xué)位論文


  更多相關(guān)文章: 最近鄰搜索 近似最近鄰搜索 量化 組合量化 近似正交的組合量化 稀疏組合量化 跨模態(tài)近似最近鄰搜索 跨模態(tài)協(xié)同量化 有監(jiān)督近似最近鄰搜索


【摘要】:最近鄰搜索是機(jī)器學(xué)習(xí)、計(jì)算機(jī)視覺(jué)和信息檢索里一個(gè)重要的基礎(chǔ)性問(wèn)題。然而,在大規(guī)模高維數(shù)據(jù)環(huán)境下,給定查詢點(diǎn),找到其精確的最近鄰需要大量的計(jì)算及存儲(chǔ)空間。近似最近鄰搜索算法由于其存儲(chǔ)空間少、查找效率高等優(yōu)點(diǎn)引起了人們的廣泛關(guān)注。而如何快速、高效、準(zhǔn)確地進(jìn)行近似最近鄰搜索是目前學(xué)術(shù)研究的一個(gè)熱點(diǎn)和難點(diǎn)。一般來(lái)說(shuō),近似最近鄰搜索的算法在盡可能保證其準(zhǔn)確性的情況下主要從兩個(gè)方面提高搜索速度。第一個(gè)是利用特殊的數(shù)據(jù)結(jié)構(gòu)來(lái)減少查詢點(diǎn)與數(shù)據(jù)點(diǎn)的比較次數(shù);第二個(gè)是利用緊湊碼來(lái)加速計(jì)算查詢點(diǎn)與數(shù)據(jù)點(diǎn)之間的距離,比如通過(guò)哈希算法或量化算法將數(shù)據(jù)點(diǎn)映射為緊湊碼。本文主要從第二個(gè)方面——基于量化的近似最近鄰搜索算法——研究如何獲得更優(yōu)質(zhì)的緊湊碼來(lái)提高查找準(zhǔn)確率和查找效率。本文主要研究?jī)?nèi)容和創(chuàng)新成果如下:1.針對(duì)無(wú)監(jiān)督的近似最近鄰搜索,本文提出一種組合量化方法。其主要思想是用若干個(gè)子中心點(diǎn)之和作為重構(gòu)點(diǎn)來(lái)近似數(shù)據(jù)點(diǎn),其中每個(gè)子中心點(diǎn)來(lái)自不同的子字典,數(shù)據(jù)點(diǎn)用這些子中心點(diǎn)在各自子字典中的索引值來(lái)表示。同時(shí),我們引入近似正交約束條件,使得計(jì)算查詢點(diǎn)與重構(gòu)點(diǎn)的距離可以用查詢點(diǎn)和這幾個(gè)子中心點(diǎn)的距離之和來(lái)代替進(jìn)而加速距離計(jì)算。與已有的量化方法的對(duì)比實(shí)驗(yàn)結(jié)果表明,近似正交的組合量化可以獲得更高的查找準(zhǔn)確率。2.本文提出一種稀疏組合量化算法,用以減少組合量化中創(chuàng)建查閱表所需的時(shí)間。大規(guī)模數(shù)據(jù)的近似最近鄰搜索通常結(jié)合倒排表進(jìn)一步加速搜索。而組合量化在對(duì)倒排表返回的數(shù)據(jù)點(diǎn)進(jìn)行排序的時(shí)候,創(chuàng)建查閱表所需的時(shí)間變得不可忽視。針對(duì)這一問(wèn)題,本文提出的稀疏組合量化方法,引入了一個(gè)稀疏條件,使得重構(gòu)字典里的每一個(gè)子中心點(diǎn)是一個(gè)稀疏向量。其好處是,當(dāng)創(chuàng)建查閱表需要計(jì)算查詢點(diǎn)與子中心點(diǎn)的歐氏距離的時(shí)候,由于子中心點(diǎn)是一個(gè)稀疏向量,可以加速距離計(jì)算。在大規(guī)模數(shù)據(jù)集上的近似最近鄰搜索表明,稀疏組合量化相比較于組合量化,可以獲得更快的查找速度。3.本文提出基于量化的近似最近鄰搜索算法用于跨模態(tài)最近鄰搜索領(lǐng)域中。所謂跨模態(tài)最近鄰搜索,指的是查詢點(diǎn)和數(shù)據(jù)點(diǎn)來(lái)自不同的數(shù)據(jù)模態(tài),例如用圖像查詢點(diǎn)去搜索相似的文本數(shù)據(jù)點(diǎn),或用文本查詢點(diǎn)去搜索相似的圖像數(shù)據(jù)點(diǎn)。本文提出的算法只假設(shè)一幅圖像和一段文本是一一對(duì)應(yīng)的,而不需要已知圖像和文本的類別。該算法首先將來(lái)自不同模態(tài)的一對(duì)數(shù)據(jù)映射到同一空間中,之后在這個(gè)映射后的空間對(duì)不同模態(tài)的數(shù)據(jù)通過(guò)組合量化進(jìn)行近似,同時(shí)使來(lái)自不同模態(tài)的一對(duì)數(shù)據(jù)的近似表示盡可能相同。大量的實(shí)驗(yàn)比較表明,本文提出的算法在跨模近似態(tài)最近鄰搜索中可以獲得更高的查找準(zhǔn)確率。4.針對(duì)有監(jiān)督近似最近鄰搜索,本文提出了一種新的量化方法。不同于無(wú)監(jiān)督近似最近鄰搜索,量化算法直接在數(shù)據(jù)庫(kù)上進(jìn)行量化,本文提出的算法是使數(shù)據(jù)點(diǎn)首先通過(guò)一個(gè)線性變換,之后在線性變換后的數(shù)據(jù)點(diǎn)上進(jìn)行組合量化。其優(yōu)化的目的不僅要使得量化后的近似表達(dá)能準(zhǔn)確地代表線性變換后的數(shù)據(jù)點(diǎn),同時(shí)也使得數(shù)據(jù)點(diǎn)在線性變換后具有類別可分離性,即相同類別的數(shù)據(jù)點(diǎn)在線性變換后距離很近,不同類別的數(shù)據(jù)點(diǎn)在線性變換后的空間內(nèi)相距很遠(yuǎn)。與現(xiàn)有的有監(jiān)督近似最近鄰搜索算法的實(shí)驗(yàn)比較表明,本文提出的算法可以獲得更高的查找準(zhǔn)確率。綜上,本文在無(wú)監(jiān)督的近似最近鄰搜索,跨模態(tài)的近似最近鄰搜索,以及有監(jiān)督的近似最近鄰搜索這三個(gè)領(lǐng)域提出了四個(gè)新穎的算法,用于提高近似最近鄰搜索的查找準(zhǔn)確率以及查找效率。大量實(shí)驗(yàn)結(jié)果表明了本文提出的方法的查找結(jié)果好于已有方法的查找結(jié)果。
[Abstract]:Nearest neighbor search is machine learning, computer vision and information retrieval in a fundamental question. However, in large scale and high dimensional data environment, a query point, to find the exact nearest neighbor needs a large amount of calculation and storage space. The approximate nearest neighbor search algorithm due to its less storage space, higher search efficiency has aroused people's attention. How to fast, efficient, accurate approximate nearest neighbor search is a research hotspot and difficulty. In general, approximate nearest neighbor search algorithm in as far as possible to ensure the accuracy of the situation mainly from two aspects to improve the search speed. The first is to use data the special structure to reduce the number of comparisons with the query point data point; the second is the use of compact code to accelerate the calculation between the query point and the data points in the distance, for example by hashing or quantity The algorithm will map the data points for the compact code. This paper mainly from second aspects: quantification of the approximate nearest neighbor search algorithm, study how to obtain better quality compact codes based on to improve the search accuracy and search efficiency. In this paper, the main research contents and innovations are as follows: 1. for approximate nearest neighbor search unsupervised, is proposed in this paper. A combination of quantitative methods. The main idea is to use a number of sub center as a reconstructed point to approximate the data points, where each sub center from different sub dictionaries, data points with these sub center point in each sub dictionary index to indicate the value. At the same time, we introduce approximate orthogonal constraint condition, the the calculation and reconstruction of the distance from the query point can be used in this query point and several sub center distance and instead thereby accelerating the distance calculation. Compared with the existing methods of quantification The results show that the combination of quantitative approximate orthogonality can obtain higher search accuracy of.2. this paper presents a combination of sparse quantization algorithm is used to reduce the time required to create table combination quantification. The large-scale data approximate nearest neighbor search is usually combined with inverted list to further accelerate the search. And when the sort of quantitative combination inverted list returned by the data points, the time required to create a lookup table is not to be ignored. To solve this problem, a combination of sparse quantization method is proposed in this paper, the introduction of a sparse condition, making each sub center reconstruction dictionary is a sparse vector. Its advantage is that when creating access the table needs to calculate the query point and sub center of the Euclidean distance, the sub center is a sparse vector, can accelerate the approximate distance calculation. In a large data set of nearest neighbor search show, Compared with the combination of sparse combination quantization quantization, can obtain faster searching speed of.3. this paper puts forward the quantitative approximate nearest neighbor search algorithm for nearest neighbor search in the field of cross modal based on cross modal. The so-called nearest neighbor search, refers to the query and data points from the data modal different, such as image query point to search for text data similar, or text query point to search for similar image data. This algorithm only if an image and a text is one-to-one, but does not need to know the image and text categories. Firstly, a mapping of data from different modes in the same space, after in the mapping space on the different modes of data through the combination of quantitative approximation, and the approximation of the data from different modes of representation as much as possible the same experiments than. Show that the proposed algorithm in cross modal approximate nearest neighbor search in the state can get a higher search accuracy of.4. for supervised approximate nearest neighbor search, this paper proposes a new quantitative method. Different from unsupervised approximate nearest neighbor search, quantization algorithm directly in the database on the quantification. The proposed algorithm is the first data point by a linear transformation, after the combination of quantization in a linear transformation of data points. The purpose is not only to make the approximate expression can accurately represent the linear transformation of the quantized data points, but also makes the data points in a linear transformation of class separability. That is the same type of data points in a linear transformation was very close, different categories of data points far apart in a linear transformation space. With the existing supervised approximate nearest neighbor search algorithm than experiment Show that the proposed algorithm can obtain higher search accuracy. To sum up, based on the approximate nearest neighbor search unsupervised, cross modal approximate nearest neighbor search, and the approximate nearest neighbor search in these three areas proposed four novel supervised algorithm, used to improve the approximate nearest neighbor search search the accuracy and efficiency of searching. The experimental results show that the method proposed in this paper to find the result is better than the existing methods of search results.

【學(xué)位授予單位】:中國(guó)科學(xué)技術(shù)大學(xué)
【學(xué)位級(jí)別】:博士
【學(xué)位授予年份】:2017
【分類號(hào)】:TP391.3

【相似文獻(xiàn)】

相關(guān)期刊論文 前10條

1 楊秀娟;;空間對(duì)象的雙色反向最近鄰查詢研究[J];煤炭技術(shù);2009年06期

2 張桂榕;;反向最近鄰查詢研究綜述[J];電腦知識(shí)與技術(shù);2011年28期

3 周屹;;不確定對(duì)象的反向最近鄰查詢研究[J];黑龍江工程學(xué)院學(xué)報(bào)(自然科學(xué)版);2012年04期

4 劉永山,薄樹(shù)奎,張強(qiáng),郝忠孝;多對(duì)象的最近鄰查詢[J];計(jì)算機(jī)工程;2004年11期

5 郝忠孝;劉永山;;空間對(duì)象的反最近鄰查詢[J];計(jì)算機(jī)科學(xué);2005年11期

6 王淼;郝忠孝;;不確定性對(duì)象的反向最近鄰查詢[J];計(jì)算機(jī)工程;2010年10期

7 張旭;何向南;金澈清;周傲英;;面向不確定圖的k最近鄰查詢[J];計(jì)算機(jī)研究與發(fā)展;2011年10期

8 楊澤雪;郝忠孝;;空間數(shù)據(jù)庫(kù)中的障礙反向最近鄰查詢[J];計(jì)算機(jī)工程與應(yīng)用;2011年34期

9 王丹丹;郝忠孝;;道路網(wǎng)絡(luò)中的多類型K最近鄰查詢[J];計(jì)算機(jī)工程與應(yīng)用;2012年03期

10 鄧瑾;周梅;;基于R樹(shù)及其變種的最近鄰查詢研究[J];現(xiàn)代計(jì)算機(jī);2013年09期

相關(guān)會(huì)議論文 前10條

1 張曉峰;王麗珍;肖清;趙麗紅;;基于概念劃分的連續(xù)最近鄰查詢研究[A];NDBC2010第27屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集(B輯)[C];2010年

2 管猛;張剡;柏文陽(yáng);;基于地表的連續(xù)可見(jiàn)最近鄰查詢方法[A];NDBC2010第27屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集(B輯)[C];2010年

3 陳璐;高云君;柳晴;陳剛;;受限相互最近鄰查詢處理[A];第29屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集(B輯)(NDBC2012)[C];2012年

4 盛梅紅;沙朝鋒;宮學(xué)慶;嵇曉;周傲英;;道路網(wǎng)絡(luò)環(huán)境中的多對(duì)象最近鄰查詢[A];第二十三屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集(研究報(bào)告篇)[C];2006年

5 劉月清;章勇;;一種改進(jìn)的動(dòng)態(tài)最近鄰聚類算法[A];全國(guó)自動(dòng)化新技術(shù)學(xué)術(shù)交流會(huì)會(huì)議論文集(一)[C];2005年

6 李傳文;谷峪;李芳芳;于戈;;一種障礙空間中不確定對(duì)象的連續(xù)最近鄰查詢方法[A];NDBC2010第27屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集A輯一[C];2010年

7 劉星毅;;基于歐式距離的最近鄰改進(jìn)算法[A];廣西計(jì)算機(jī)學(xué)會(huì)2010年學(xué)術(shù)年會(huì)論文集[C];2010年

8 劉先康;梁菁;任杰;蔣光慶;;修正最近鄰模糊分類算法在艦船目標(biāo)識(shí)別中的應(yīng)用[A];全國(guó)第4屆信號(hào)和智能信息處理與應(yīng)用學(xué)術(shù)會(huì)議論文集[C];2010年

9 劉俊嶺;孫煥良;;多維度量空間中發(fā)現(xiàn)相互kNN(英文)[A];NDBC2010第27屆中國(guó)數(shù)據(jù)庫(kù)學(xué)術(shù)會(huì)議論文集A輯二[C];2010年

10 余小高;;P2P環(huán)境中k最近鄰搜索算法研究[A];2009年全國(guó)開(kāi)放式分布與并行計(jì)算機(jī)學(xué)術(shù)會(huì)議論文集(下冊(cè))[C];2009年

相關(guān)博士學(xué)位論文 前9條

1 張婷;基于量化的近似最近鄰搜索技術(shù)研究[D];中國(guó)科學(xué)技術(shù)大學(xué);2017年

2 楊澤雪;空間連接及最近鄰變體查詢研究[D];哈爾濱理工大學(xué);2014年

3 孫冬璞;時(shí)空數(shù)據(jù)庫(kù)多類型最近鄰查詢的研究[D];哈爾濱理工大學(xué);2010年

4 王建峰;基于哈希的最近鄰查找[D];中國(guó)科學(xué)技術(shù)大學(xué);2015年

5 張得天;時(shí)間依賴路網(wǎng)高效k最近鄰查詢混搭機(jī)制的研究[D];中國(guó)科學(xué)技術(shù)大學(xué);2014年

6 杜欽生;高維空間的K最近鄰查詢及連接問(wèn)題研究[D];吉林大學(xué);2015年

7 張軍旗;支持最近鄰查找的高維空間索引[D];復(fù)旦大學(xué);2007年

8 李艷紅;路網(wǎng)中移動(dòng)對(duì)象最近鄰及反向最近鄰查詢處理研究[D];華中科技大學(xué);2011年

9 魏本昌;基于內(nèi)容的大規(guī)模圖像檢索技術(shù)研究[D];華中科技大學(xué);2015年

相關(guān)碩士學(xué)位論文 前10條

1 楊根茂;基于哈希加速的近似最近鄰檢索算法研究[D];浙江大學(xué);2015年

2 原s,

本文編號(hào):1363393


資料下載
論文發(fā)表

本文鏈接:http://sikaile.net/shoufeilunwen/xxkjbs/1363393.html


Copyright(c)文論論文網(wǎng)All Rights Reserved | 網(wǎng)站地圖 |

版權(quán)申明:資料由用戶c5dfb***提供,本站僅收錄摘要或目錄,作者需要?jiǎng)h除請(qǐng)E-mail郵箱bigeng88@qq.com