Image Classification Based on Set Representation
Published: 2018-07-29 11:43
[Abstract]: The essence of image classification is recognizing the objects or targets in an image, which requires an accurate description of the image's visual information. Because local information is robust to background detail, illumination, and other external conditions, it has become the mainstream approach to feature representation, especially since the appearance of the scale-invariant feature transform (SIFT) and its many improved variants. However, different images usually yield different numbers of local features, so classification, retrieval, and other downstream operations cannot be applied directly to the local features; a unified set representation over an image's local feature set is therefore needed. A set representation applies some method to all the local feature points extracted from an image and forms a single vector that represents the image. The main work and contributions of this thesis are as follows. First, from the perspective of set representation of images, the thesis describes in detail three set representation methods, namely the bag-of-words model, efficient match kernels, and the locally aggregated descriptor, and carries out extensive experiments with them on the selected databases to verify their classification performance. Second, it examines how suitable different clustering algorithms and different numbers of cluster centers are for the recently proposed locally aggregated descriptor representation. Based on how the number of cluster centers is chosen and how local features are assigned to centers, three clustering algorithms are used: K-means, affinity propagation, and the Gaussian mixture model. Finally, the thesis proposes its own improvements to the locally aggregated descriptor, comprehensively studying the effect and effectiveness of two operations, normalization and pooling, on the descriptor. Normalization uses the power-law and L2-norm schemes; pooling uses sum pooling, average pooling, and generalized max pooling. PPMI, Caltech-101, and Scene-15 are databases of actions, objects, and scenes, respectively, and the effectiveness of the above methods is verified on these three databases.
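The pipeline the abstract describes (assign local features to cluster centers, pool the residuals, then apply power-law and L2 normalization) can be sketched as follows. This is a minimal illustration and not the thesis's implementation. It assumes a VLAD-style encoding with hard nearest-center assignment and a codebook that has already been learned (for example by K-means). The function name `vlad_encode` and the parameters `alpha` and `pooling` are hypothetical names chosen for this example. Generalized max pooling is left out.

```python
import numpy as np

def vlad_encode(descriptors, centers, alpha=0.5, pooling="sum"):
    """Encode a variable-size set of local descriptors as one fixed-length vector.

    descriptors: (n, d) local features of one image (n differs per image).
    centers:     (k, d) cluster centers, assumed to be learned beforehand.
    alpha:       power-law exponent (hypothetical default of 0.5).
    pooling:     "sum" or "average" pooling of the residuals per center.
    """
    if pooling not in ("sum", "average"):
        raise ValueError("pooling must be 'sum' or 'average'")
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    k, d = centers.shape

    # Hard assignment: index of the nearest center for every descriptor.
    sq_dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = sq_dists.argmin(axis=1)

    # Pool the residuals (descriptor minus center) for each center.
    pooled = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members) == 0:
            continue  # a center with no assigned descriptors stays at zero
        residuals = members - centers[j]
        if pooling == "sum":
            pooled[j] = residuals.sum(axis=0)
        else:
            pooled[j] = residuals.mean(axis=0)

    # Concatenate into one k*d vector, then normalize.
    v = pooled.ravel()
    v = np.sign(v) * np.abs(v) ** alpha      # power-law normalization
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v       # L2 normalization


# Two images with different numbers of local features map to vectors
# of the same length, which is the point of a set representation.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
image_a = vlad_encode([[1, 0], [0, 1], [11, 10]], codebook)
image_b = vlad_encode([[9, 9], [2, 1]], codebook, pooling="average")
```

Both `image_a` and `image_b` have length `k * d` (here 4), so they can be passed directly to a standard classifier regardless of how many local features each image produced.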
[Degree-granting institution]: Harbin Engineering University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.41
Article ID: 2152572
Link to this article: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2152572.html