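Research on Acquiring Image Semantic Content Based on Visual Cognitive Mechanisms

Published: 2018-06-16 00:29

Topic: superpixel segmentation + saliency detection. Reference: doctoral dissertation, University of Science and Technology Beijing, 2016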

【Abstract】: To simulate the human visual cognitive mechanism on a computer, reproduce the visual functions of humans and other higher organisms, and so perceive, recognize, and understand image scenes that reflect the objective world, a system must derive semantic content that humans can understand from the visual content of an image. In the initial stage of visual perception, visual attention tends to settle quickly on local regions or objects that carry some semantic information, and these regions or objects are exactly what the semantic content describes. As the local regions are located, the visual system automatically focuses on the main or salient objects in the scene, guided by the visual differences in shape and local features among those regions. Finally, the cognitive system works outward from the salient objects in focus and their associated information to form semantic content and a perception that describe the whole scene.

Following this process, the thesis first uses an improved superpixel segmentation method to extract local regions of the image that carry some semantic information. It then combines the visual features of these local regions to build a salient object or region detection model, which yields mid- and high-level semantic information: saliency and salient visual content. Finally, taking the salient objects or regions and their related information as visual guidance, it uses a neural network trained by deep learning to build an automatic semantic annotation model for images and obtain the final high-level semantic description of the scene. The specific contributions are as follows.

1) For local region extraction, the thesis proposes a superpixel segmentation method that extends SLICO with texture information. During segmentation the method incorporates texture features that reflect the inherent outer contours and boundaries of the objects and regions in the image. It also searches a circular area around each seed pixel. This further improves processing efficiency and lets the superpixels follow the outer contours of local regions or objects more closely, so the method produces, relatively quickly, superpixels of regular size and shape whose boundaries match the contours of objects and regions. Experiments and quantitative comparison on the public BSDS500 dataset show that the proposed method, SLICO-t, outperforms the highly regarded SLICO method. In boundary recall it exceeds SLICO by a fairly stable 8 to 9 percentage points.

2) For salient object or region detection, the thesis first proposes a sparse histogram model that describes the local region information of a superpixel. The model integrates the local texture, color, and shape information of the region. On this basis it proposes an image saliency detection method that separates the detected salient objects or regions clearly and completely from the background, while the salient objects or regions keep relatively complete outer contours and shape features as well as local texture detail. Experiments and quantitative evaluation on the public test dataset provided by Achanta et al., with comparison against five currently popular saliency detection methods, show that the proposed method is better than the others in precision, average F-measure, and mean absolute error.

3) For automatic image annotation and semantic content acquisition, the thesis first uses the visual features of the salient objects in the scene as prior knowledge to perceive the salient objects or regions. On top of the perceived salient objects or regions, it then applies the features of the overall local regions for a further, reinforcing mapping. This two-layer mapping is trained on two kinds of visual features and amounts to neural-network-based fusion at the decision level during self-learning. When encoding the semantic information of images and text, the thesis adopts an order-preserving mapping that has already been validated elsewhere, so that the latent relationship between an image and its textual semantic description is uncovered fairly accurately. The model is trained, validated, and tested separately on three public datasets, Flickr8k, Flickr30k, and MSCOCO, and is evaluated on bidirectional image-semantic retrieval. The results show that, compared with currently published methods, the proposed method further improves recall at every cutoff (Recall@K, K = 1, 5, 10), and the semantic content it produces matches human cognitive habits more closely and reads naturally and fluently.

The results of this thesis are a useful reference for research on image local feature representation and extraction, image segmentation, and image understanding in broader fields.
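The circular search strategy in contribution 1) can be illustrated with a small sketch. The abstract does not give the radius or any other detail, so the function name `circular_window` and the choice of radius here are hypothetical; the sketch only shows how restricting the search to a disc visits fewer pixels than the square window of the same half-width.

```python
from typing import Iterator, Tuple


def circular_window(seed_row: int, seed_col: int, radius: int,
                    rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield the in-image pixels within `radius` (Euclidean) of the seed.

    Hypothetical helper: the thesis only states that a circular area around
    each seed pixel is searched; the radius is an assumption of this sketch.
    """
    radius_sq = radius * radius
    for r in range(max(0, seed_row - radius), min(rows, seed_row + radius + 1)):
        for c in range(max(0, seed_col - radius), min(cols, seed_col + radius + 1)):
            if (r - seed_row) ** 2 + (c - seed_col) ** 2 <= radius_sq:
                yield (r, c)


# Compare the number of candidate pixels for a disc and for a square window.
radius = 2
disc = list(circular_window(10, 10, radius, rows=21, cols=21))
square_count = (2 * radius + 1) ** 2
print(len(disc), square_count)  # 13 25
```

For large radii the disc covers roughly pi/4 (about 79%) of the square, which is one plausible source of the efficiency gain the abstract mentions.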
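Boundary recall, the metric on which SLICO-t is reported to gain 8 to 9 percentage points, is commonly computed as the fraction of ground-truth boundary pixels that lie close to a superpixel boundary. The sketch below is a minimal pure-Python version under assumed conventions (a pixel is a boundary pixel if its right or lower neighbour has a different label, and distance is measured with a square tolerance `tol`); the thesis may use a different boundary definition or tolerance.

```python
from typing import List, Set, Tuple

Labels = List[List[int]]  # 2D grid of region labels, one integer per pixel


def boundary_pixels(labels: Labels) -> Set[Tuple[int, int]]:
    """Return pixels whose right or lower neighbour carries a different label."""
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                edges.add((r, c))
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                edges.add((r, c))
    return edges


def boundary_recall(truth: Labels, superpixels: Labels, tol: int = 1) -> float:
    """Fraction of ground-truth boundary pixels that have a superpixel
    boundary pixel within `tol` pixels (Chebyshev distance)."""
    gt = boundary_pixels(truth)
    sp = boundary_pixels(superpixels)
    if not gt:
        return 1.0  # no ground-truth boundary to recall
    hits = 0
    for r, c in gt:
        if any((r + dr, c + dc) in sp
               for dr in range(-tol, tol + 1)
               for dc in range(-tol, tol + 1)):
            hits += 1
    return hits / len(gt)


# Toy 4x6 image: the ground truth splits the left and right halves.
truth = [[0, 0, 0, 1, 1, 1] for _ in range(4)]
aligned = [[0, 0, 0, 1, 1, 1] for _ in range(4)]      # same vertical boundary
misaligned = [[0] * 6, [0] * 6, [1] * 6, [1] * 6]     # horizontal boundary only

print(boundary_recall(truth, aligned, tol=0))     # 1.0
print(boundary_recall(truth, misaligned, tol=0))  # 0.25
```

A higher tolerance forgives small localisation errors, so reported boundary recall figures are only comparable when the same tolerance is used.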
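The three saliency measures named in contribution 2), precision, F-measure, and mean absolute error, can be sketched for a single image as follows. The weighting `beta_sq = 0.3` is a value widely used in the saliency literature and is an assumption here, as are the fixed binarisation threshold and the function names; the abstract does not state the settings actually used. The "average F-measure" would then be the mean of the per-image values over the dataset.

```python
from typing import List, Tuple

Map = List[List[float]]  # saliency or ground-truth values in [0, 1]


def precision_recall(saliency: Map, truth: Map, threshold: float) -> Tuple[float, float]:
    """Binarise the saliency map at `threshold` and compare with the mask."""
    tp = fp = fn = 0
    for s_row, t_row in zip(saliency, truth):
        for s, t in zip(s_row, t_row):
            predicted = s >= threshold
            actual = t >= 0.5
            if predicted and actual:
                tp += 1
            elif predicted:
                fp += 1
            elif actual:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta_sq: float = 0.3) -> float:
    """Weighted harmonic mean; beta_sq = 0.3 is an assumed, commonly used weight."""
    denom = beta_sq * precision + recall
    return (1 + beta_sq) * precision * recall / denom if denom else 0.0


def mean_absolute_error(saliency: Map, truth: Map) -> float:
    """Average per-pixel absolute difference between the map and the mask."""
    diffs = [abs(s - t)
             for s_row, t_row in zip(saliency, truth)
             for s, t in zip(s_row, t_row)]
    return sum(diffs) / len(diffs)


saliency = [[0.9, 0.8, 0.1],
            [0.7, 0.2, 0.0]]
truth = [[1.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]

p, r = precision_recall(saliency, truth, threshold=0.5)
print(round(p, 4), round(r, 4))                         # 0.6667 1.0
print(round(f_measure(p, r), 4))                        # 0.7222
print(round(mean_absolute_error(saliency, truth), 4))   # 0.2167
```

Unlike precision and F-measure, the mean absolute error needs no threshold, so it also penalises maps that are only weakly confident over the true object.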
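Recall@K, the retrieval measure used in contribution 3), is the share of queries for which at least one correct item appears among the top K results. The sketch below assumes each query comes with a ranked list of candidate ids and a set of correct ids; the function name and data layout are illustrative and not taken from the thesis. For bidirectional retrieval the same function would be applied once with images as queries and once with sentences as queries.

```python
from typing import Dict, Sequence, Set


def recall_at_k(ranked: Sequence[Sequence[int]],
                relevant: Sequence[Set[int]],
                ks: Sequence[int] = (1, 5, 10)) -> Dict[int, float]:
    """ranked[i] lists candidate ids for query i, best first;
    relevant[i] is the set of correct ids for query i."""
    if not ranked:
        return {k: 0.0 for k in ks}
    scores = {}
    for k in ks:
        hits = sum(1 for candidates, correct in zip(ranked, relevant)
                   if any(c in correct for c in candidates[:k]))
        scores[k] = hits / len(ranked)
    return scores


# Four toy queries over six candidates; query i is matched by candidate i.
ranked = [
    [3, 0, 1, 2, 4, 5],  # correct item at rank 2
    [1, 0, 2, 3, 4, 5],  # correct item at rank 1
    [0, 1, 3, 4, 5, 2],  # correct item at rank 6
    [3, 2, 1, 0, 4, 5],  # correct item at rank 1
]
relevant = [{0}, {1}, {2}, {3}]

print(recall_at_k(ranked, relevant))  # {1: 0.5, 5: 0.75, 10: 1.0}
```

Because `relevant[i]` is a set, the same code covers the case where an image has several reference captions and any one of them counts as a hit.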
【Degree-granting institution】: University of Science and Technology Beijing
【Degree level】: Doctoral
【Year degree conferred】: 2016
【Classification number】: TP391.41

【Author】: Nan Bingfei (南柄飛)

Article ID: 2024360

Article link: http://sikaile.net/shoufeilunwen/xxkjbs/2024360.html
