Research on Outdoor Scene Understanding Based on Deep Convolutional Neural Networks
[Abstract]: Scene understanding is a hot topic in computer vision and artificial intelligence, and its results are widely applied in robot navigation, web search, security surveillance, medical care, and other fields. Branch tasks of scene understanding, such as object detection and image semantic segmentation, have made breakthroughs in recent years, but many shortcomings remain. For example, it is difficult to obtain reliable, robust features for classifying dynamic objects in a scene because of deformation of the objects themselves and interference from external factors. Deep Convolutional Neural Networks (DCNN) can effectively classify scene images semantically through end-to-end feature learning, but accurate semantic segmentation of scene images remains difficult. The main contents of this thesis are as follows: 1) First, a dynamic-object classification method based on a multi-task spatial pyramid pooling DCNN is proposed. Dynamic objects in video scenes are extracted with a Gaussian mixture model, and complete object image patches are obtained by morphological processing. Each patch is then fed into the multi-task spatial pyramid pooling DCNN, which classifies it and simultaneously produces its semantic label. Experimental results show that high-level convolutional features are robust to partial occlusion, overlap, and viewpoint change, and that the multi-task spatial pyramid pooling DCNN achieves high classification accuracy and assigns accurate semantic labels in dynamic-object classification tasks. 2) An outdoor-scene semantic segmentation method combining a DCNN with the MeanShift image segmentation algorithm is proposed.
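The spatial pyramid pooling step named above can be sketched in a few lines: a convolutional feature map of arbitrary spatial size is pooled over coarse-to-fine grids and the results are concatenated into a fixed-length vector, which is what lets the network accept object patches of varying size. This is a minimal numpy illustration; the pyramid levels (1, 2, 4) and the use of max pooling are assumptions for the sketch, as the abstract does not state the thesis's exact configuration.

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Pool a (C, H, W) feature map over an n x n grid for each pyramid
    level and concatenate, yielding a fixed-length vector regardless of
    the input's spatial size."""
    C, H, W = feature_map.shape
    pooled = []
    for n in levels:
        # integer bin edges for an n x n grid over the feature map
        h_edges = np.linspace(0, H, n + 1).astype(int)
        w_edges = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = feature_map[:, h_edges[i]:h_edges[i + 1],
                                      w_edges[j]:w_edges[j + 1]]
                pooled.append(cell.max(axis=(1, 2)))  # max-pool per channel
    # output length: C * sum(n*n for n in levels), independent of H and W
    return np.concatenate(pooled)

fmap = np.random.rand(64, 13, 17)   # arbitrary patch-dependent spatial size
vec = spatial_pyramid_pool(fmap)
print(vec.shape)                    # (64 * (1 + 4 + 16),) = (1344,)
```

Because the output length depends only on the channel count and the pyramid levels, patches of any size produce vectors the classifier's fully connected layers can consume directly.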
The scene image is first pre-segmented with the MeanShift algorithm; sample image patches are then collected randomly from each resulting local region and fed into the DCNN to obtain class probabilities. Finally, the class probabilities of each region's sample patches are averaged to obtain that region's semantic label, realizing the semantic segmentation. The effects of DCNN kernel size, the number of convolution kernels, and training-set augmentation on segmentation results are studied and analyzed. Compared with the SEVI-BOVW method based on the SIFT local feature descriptor, experimental results show that both the accuracy and the recognition speed of the proposed method are greatly improved. 3) Finally, a DCNN-based scene understanding method combining object detection and semantic segmentation is proposed, and it is applied to robot campus navigation together with a background-object semantic segmentation method based on HOG (Histogram of Oriented Gradients) texture features and an SVM (Support Vector Machine) classifier. In this method, foreground objects in the scene image are detected with the Faster R-CNN algorithm and segmented with the DeepLab-CRFs model; the GrabCut foreground extraction algorithm then fuses the detection and segmentation results to achieve a more accurate and complete semantic segmentation of the target objects. Experiments show that the proposed method can detect and segment objects accurately and completely, and can be used effectively in robot campus navigation.
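The region-labeling step described above — averaging the DCNN's class probabilities over the patches sampled from one MeanShift region, then taking the most probable class — can be sketched as follows. The patch probabilities here are hypothetical placeholder values standing in for real softmax outputs; the four-class setup is an assumption for illustration.

```python
import numpy as np

def region_label(patch_probs):
    """patch_probs: (num_patches, num_classes) softmax outputs for the
    patches sampled from one pre-segmented region. Returns the region's
    semantic label and the averaged class distribution."""
    mean_probs = patch_probs.mean(axis=0)       # average over sampled patches
    return int(np.argmax(mean_probs)), mean_probs

# three hypothetical patch predictions from one region, four classes
probs = np.array([[0.70, 0.10, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.50, 0.30, 0.10, 0.10]])
label, mean_probs = region_label(probs)
print(label)  # 0 — the region inherits the averaged majority class
```

Averaging before the argmax smooths out individual patch misclassifications, so a few outlier patches do not flip the label of an otherwise homogeneous region.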
【Degree-granting institution】: Hangzhou Dianzi University
【Degree level】: Master's
【Year conferred】: 2016
【CLC number】: TP391.41; TP183