Image Retrieval Based on Deep Learning Representations
Published: 2018-04-03 07:17
Topic: image retrieval  Focus: deep representation  Source: University of Science and Technology of China, 2017 doctoral dissertation
[Abstract]: With the widespread adoption of digital cameras and smartphones in recent years and the continued growth in storage capacity, multimedia content, and visual data in particular, has grown explosively. How to retrieve massive amounts of visual content quickly and effectively has therefore long been a research focus in academia and industry at home and abroad. Early image retrieval systems typically used text-based queries: they matched the query text supplied by the user against web page content and returned the images associated with that text. With the development of computer vision, content-based image retrieval (CBIR) has come to complement text queries in interpreting user intent and improving user experience, and it plays a prominent role in commercial scenarios such as product search, landmark retrieval, and trademark duplicate checking.
Deep learning has made striking progress in recent years. For image content representation, deep learning-based image representations (deep representations for short) have shown excellent performance in many computer vision tasks. Among deep learning models, the deep convolutional neural network (CNN) is especially good at abstracting and describing image content, and it has received wide attention and in-depth study in image retrieval. Unlike traditional image representations, a deep representation focuses on a global, semantic-level expression of the image: an end-to-end model extracts the important information in the image and describes its content with a compact feature.
Although existing retrieval methods based on deep representations achieve remarkable performance, several problems remain hard to overcome. (1) Unlike traditional representations based on local visual features, deep representations characterize the image as a whole at the semantic level, so they do not emphasize local detail and are sensitive to spatial position and geometric deformation. (2) Methods based on local representations can use the spatial relationships among local features to verify image matches geometrically and so match more precisely, whereas deep representations can hardly exploit this property to improve retrieval. (3) Existing methods mostly validate retrieval algorithms on manually annotated public benchmark datasets, which cannot provide real-time quality assessment for arbitrary queries and makes it inconvenient for a search engine to correct its results as needed.
To address these problems, this dissertation studies image retrieval based on deep representations, including how to construct good representations, how to enhance retrieval results, and how to assess retrieval results effectively in real time. Its contributions are as follows.
(1) A deep representation based on generic object detection, which combines the semantic representation power of deep learning with the discriminative power of salient image regions. A generic object detector first finds a small number of regions most likely to contain objects, and deep representations are extracted from these regions. To describe local attributes within the regions, local invariant features are also extracted there and fused with the deep representations, yielding a richer image representation.
(2) Database augmentation and query-result re-ranking at the level of deep representations, applied in the offline indexing stage and the online query stage respectively, which enhance retrieval performance at very small computational and storage cost. In the indexing stage, the method exploits the relationships among database images and updates features in an unsupervised way using neighborhood information, so that the resulting features retrieve better. In the query stage, it builds a residual representation of the initial retrieval results and makes full use of the neighborhood information of the query feature to re-rank the results.
(3) A method that assesses retrieval quality automatically from the correlations among retrieval results and supports applications such as online selection of the best of several retrieval results. For each retrieval result, a feature matrix is built from the correlations among the deep representations of the returned images, and a convolutional neural network is trained to regress retrieval quality. Correlation matrices obtained from several representations can be concatenated to give a quality assessment method based on multi-feature fusion.
Starting from deep representations, the dissertation studies image retrieval across feature construction, offline indexing, online re-ranking, and quality assessment. The proposed methods are presented and validated at the method, experiment, and application levels, demonstrating their reliability and practicality.
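The index-stage feature update in contribution (2) can be illustrated with a generic neighborhood-averaging scheme. The abstract does not give the dissertation's actual update rule, so the sketch below is an assumption: each database feature is replaced by the re-normalized mean of itself and its k most similar database features. The names `l2_normalize`, `dot`, `augment_database`, and the parameter `k` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product; equals cosine similarity for unit-norm inputs."""
    return sum(x * y for x, y in zip(a, b))


def augment_database(features, k=2):
    """Unsupervised update: average each feature with its k nearest neighbours.

    Assumption: plain mean pooling over the neighbourhood. This is a generic
    stand-in, not the update rule used in the dissertation.
    """
    feats = [l2_normalize(f) for f in features]
    updated = []
    for f in feats:
        # Rank every database feature by similarity to f. f itself has
        # similarity 1, so it normally appears in its own neighbourhood.
        ranked = sorted(range(len(feats)),
                        key=lambda j: dot(f, feats[j]),
                        reverse=True)
        group = [feats[j] for j in ranked[:k + 1]]
        mean = [sum(column) / len(group) for column in zip(*group)]
        updated.append(l2_normalize(mean))
    return updated
```

Because the update runs once over the database during offline indexing, it adds no cost at query time and leaves the size of the stored features unchanged, which fits the small-overhead claim in the abstract.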
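The input to the quality assessment in contribution (3) can be sketched as follows. The abstract says only that a matrix is built from the correlations among the deep representations of the results and that matrices from several representations are concatenated. Using cosine similarity as the correlation measure and stacking the matrices as channels are assumptions, as are the names `correlation_matrix` and `stack_correlations`. The CNN regressor itself is omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def correlation_matrix(result_features):
    """Build the k x k matrix of pairwise similarities among the top-k results.

    Assumption: cosine similarity stands in for the correlation measure.
    """
    feats = [l2_normalize(f) for f in result_features]
    return [[sum(x * y for x, y in zip(a, b)) for b in feats] for a in feats]


def stack_correlations(feature_sets):
    """Return one k x k matrix per representation, stacked like image channels.

    Assumption: channel-wise stacking is one way to read "concatenated".
    Every feature set must describe the same k ranked results.
    """
    return [correlation_matrix(features) for features in feature_sets]
```

The intuition is that a good result list tends to contain mutually similar images, so its matrix has uniformly high entries, while a poor list gives a noisier matrix. A regressor trained on such matrices can then score a query without ground-truth labels.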
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TP391.41
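[Similar literature]
Related journal articles (top 1)
1 Duanmu Xia; This News Story Has Depth [J]; Xinwenjie (新闻界); 2002, No. 03
Related doctoral dissertations (top 1)
1 Sun Shaoyan; Image Retrieval Based on Deep Learning Representations [D]; University of Science and Technology of China; 2017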
Article ID: 1704165
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1704165.html