Research on Image Understanding Models Based on Visual Features with Embedded Structural Information

Published: 2018-04-25 03:05

Topic: structural visual features + image understanding. Reference: Tianjin University, 2015 doctoral dissertation

[Abstract]: With the development of Internet technology and the rise of the mobile Internet, vast amounts of multimedia content, images in particular, are being uploaded to the Internet, and the volume keeps growing. This has brought people into the era of big image data. Mining the useful information in massive image collections, together with the economic and social value it carries, involves many image-related technologies, and image understanding is one of the more important of them. Traditional image understanding methods are mainly based on the bag-of-words model: first extract a low-level feature representation of the image, then build a coding dictionary (codebook), and finally map the low-level features onto the dictionary to obtain a histogram representation of the image. Although this approach is widely used in image understanding tasks such as object recognition and image retrieval, and has achieved some success, a bag-of-words representation discards the structural information contained in the image, which limits the discriminability and robustness of the resulting features. In contrast to bag-of-words representations, this dissertation proposes a new way of constructing image feature representations that fuses structural information related to the image into the representation in order to improve its discriminability and robustness. It proposes three visual feature representations that embed different forms of structural information, and applies them to image retrieval, image classification, and semantic image annotation respectively.

The first method addresses retrieval and classification of contour images. Unlike traditional methods, which directly extract feature points from the contour image and build feature descriptors, this method embeds the structural symmetry inherent in the object depicted by the contour image into the visual feature representation, producing a representation that contains the object's symmetric structure. This representation improves the discriminability and robustness of the feature descriptor. In the experiments, the symmetry-embedded representation is applied to contour image classification and retrieval. The results show that it improves retrieval and classification accuracy, which demonstrates the effectiveness of embedding structural information in the feature representation.

The second method addresses image retrieval based on visual attributes. Unlike traditional methods, which consider only the co-occurrence between the queried attribute and other related attributes, the proposed method first embeds both the mutual-exclusion and the co-occurrence relations among visual attributes into the attribute feature representation. Using this structure-embedded representation, it then proposes an image retrieval framework based on feature reconstruction. The framework preserves the structural features of the image and thereby improves the stability and robustness of retrieval. Experimental results show that the method reduces the ambiguity of query keywords and improves retrieval accuracy.

The third method addresses weakly supervised image annotation. Because an image contains multiple objects, a traditional bag-of-words feature representation is ambiguous and cannot express the structural relations among the different objects in the image, which ultimately makes the annotation results inaccurate. The dissertation proposes embedding the structural correlation information of semantic labels into the image feature representation to overcome the ambiguity of the original features. Experimental results show that this label-structure-embedded representation improves the discriminability and generalization ability of the image features, which in turn improves the recall and accuracy of image annotation.

To verify the role that the structure of visual features plays in image understanding, the dissertation embeds structural information at different levels for different application scenarios: the symmetric structure of the object itself at the low level, the correlation structure of visual attributes at the mid level, and the semantic structure among object labels at the high level. The experimental results show that the proposed structure-embedded visual feature representations improve the discriminability and robustness of the features. They also indicate that visual features with embedded structural information are effective and can help advance image understanding within computer vision.
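As a rough illustration of the bag-of-words pipeline the abstract describes (extract local descriptors, build a codebook, map the descriptors to a histogram), the following is a minimal NumPy sketch. It is not the dissertation's implementation: the random toy descriptors, the plain k-means codebook, and the function names `build_codebook`, `assign`, and `bow_histogram` are all assumptions made for this example. The last lines shuffle the descriptors, standing in for a rearrangement of the image's spatial layout, to show that the histogram does not change, which is the loss of structural information the abstract points to.

```python
import numpy as np


def assign(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_codebook(descriptors, k, iters=10, seed=0):
    """Plain k-means (Lloyd's algorithm); an assumed choice of codebook learner."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].astype(float)
    for _ in range(iters):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers


def bow_histogram(descriptors, codebook):
    """Normalized count of how many descriptors fall on each codeword."""
    labels = assign(descriptors, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


# Toy stand-in for the local descriptors of one image (hypothetical data).
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 8))

codebook = build_codebook(descriptors, k=5)
hist = bow_histogram(descriptors, codebook)

# Reordering the descriptors mimics rearranging the image's spatial layout.
shuffled = descriptors[rng.permutation(len(descriptors))]
hist_shuffled = bow_histogram(shuffled, codebook)

print(np.allclose(hist, hist_shuffled))  # True: the histogram ignores layout
```

Because the histogram only counts codeword occurrences, any representation that is meant to capture symmetry, attribute relations, or label structure has to add that information on top of, or in place of, these counts.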

[Degree-granting institution]: Tianjin University
[Degree level]: Doctoral
[Year degree awarded]: 2015
[Classification number]: TP391.41

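[Citation]: Zhang Hua; Research on Image Understanding Models Based on Visual Features with Embedded Structural Information [D]; Tianjin University; 2015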

Article ID: 1799456

Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1799456.html
