Research on Cross-Modal Hashing Retrieval Based on Semantic Consistency and Matrix Factorization
Published: 2018-07-24 09:50
[Abstract]: Multimodality is an important characteristic of big data. With the arrival of the big data era, retrieval across modalities, such as retrieving text with an image query, has become a latent demand. Cross-modal hashing (Cross-Modal Hashing) uses hash functions to transform query data into binary codes in Hamming space, called hash codes. This gives data from different modalities a unified form, so that retrieval across modalities becomes retrieval among hash codes, which reduces storage cost and speeds up retrieval. In addition, hash codes usually preserve the similarity between the data they represent, including both intra-modal and inter-modal similarity. Similarity preservation is the starting point of this thesis and an important component of cross-modal hashing methods. However, most current cross-modal hashing methods measure similarity using only low-level features and ignore the importance of semantics, which helps neither to narrow the semantic gap nor to improve retrieval accuracy. Humans distinguish and judge things at the semantic level, so the true relationships between data depend on semantics. When low-level features are noisy or weakly discriminative, using semantic similarity helps generate more discriminative hash codes and thus improves retrieval accuracy. This thesis measures intra-modal and inter-modal similarity at the semantic level and proposes two cross-modal hashing methods: semantic consistency cross-modal hashing, and cross-modal hashing based on semantic consistency and matrix factorization. Experiments on two mainstream existing datasets verify the effectiveness of the methods. The main contents and contributions of this thesis are as follows.
(1) Semantic consistency cross-modal hashing measures the similarity between data using semantics alone. This reduces both the amount of computation and the semantic gap between hash codes and high-level semantics, and ensures that the similarity between hash codes is semantically consistent with the similarity between the original data. The hash function transforms data into hash codes through a linear mapping followed by binarization.
(2) Cross-modal hashing based on semantic consistency and matrix factorization uses both semantics and low-level features to measure the similarity between data within each modality, and represents this similarity with a graph. This narrows the semantic gap between low-level features and high-level semantics, as well as the gap between hash codes and high-level semantics. Matrix factorization is used to construct an abstract space shared by all modalities, giving each data item an abstract representation; quantizing that representation produces the corresponding hash code. Finally, learning the hash functions is converted into learning hyperplanes for binary classification.
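The pipeline the abstract describes, a linear mapping followed by binarization and then retrieval by comparing hash codes, can be illustrated with a minimal Python sketch. It is not the thesis's method: the projection matrices `W_image` and `W_text` are random stand-ins for hash functions that the thesis learns, and the feature dimensions, code length, and function names are all assumptions made for illustration.

```python
import numpy as np


def hash_codes(features: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Map features linearly, then binarize by sign into 0/1 hash codes."""
    return (features @ projection >= 0).astype(np.uint8)


def hamming_rank(query_code: np.ndarray, database_codes: np.ndarray):
    """Return database indices sorted by Hamming distance, plus the distances."""
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances


rng = np.random.default_rng(0)
n_bits = 16  # assumed code length

# Hypothetical dimensions: 128-d image features and 10-d text features.
# Random projections stand in for the learned hash functions.
W_image = rng.standard_normal((128, n_bits))
W_text = rng.standard_normal((10, n_bits))

image_query = rng.standard_normal((1, 128))
text_database = rng.standard_normal((50, 10))

# Both modalities end up as codes of the same length in Hamming space.
query_code = hash_codes(image_query, W_image)[0]
text_codes = hash_codes(text_database, W_text)

# Cross-modal retrieval: rank text items by Hamming distance to the image query.
order, distances = hamming_rank(query_code, text_codes)
top5 = order[:5]
```

With random projections the ranking carries no semantic meaning; the sketch only shows why codes of equal length make items from different modalities directly comparable and why comparison reduces to counting differing bits.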
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41
Article ID: 2140991
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2140991.html