Image Object Detection and Classification Based on Deep Learning
Published: 2018-07-16 15:08
[Abstract]: Image object detection and classification is both a foundation of computer vision and one of its core problems, and it is closely tied to everyday life. In recent years, the strong results of deep learning methods in the ImageNet ILSVRC competition have made research on image object detection and classification increasingly active. The arrival of the big data era gives artificial intelligence unprecedented opportunities, and against this background the breakthroughs of deep learning in areas such as image object detection are no accident. R-CNN was the first to propose the now widely adopted deep-learning object detection pipeline: it uses selective search to propose candidate regions, extracts features from those regions with a deep convolutional network, and then uses a linear classifier such as a support vector machine (SVM) to label each region as object or background based on those features. This thesis improves the R-CNN model to implement a deep-learning-based image object detection and classification system. First, the region detection module is improved: in the detection window generation module, the faster Edge Boxes algorithm replaces selective search. Second, we improve R-CNN itself, abandoning the traditional stage-wise training approach and modifying the R-CNN network structure so that it can be trained end to end, which raises the mean average precision (mAP) of the detection and classification algorithm on the PASCAL VOC dataset. In addition, our improved R-CNN-based algorithm reduces the cache space needed during training and improves space utilization. Our final algorithm achieves an mAP of 56.8 on the PASCAL VOC dataset, an improvement of 70% over the DPM v5 model and of 10% over R-CNN. Previous research has focused on improving detection and classification performance and has concentrated on the data side. However, visualization of convolutional neural networks is also very much needed, so this thesis also does substantial work on visualizing CNN feature extraction. It finds that as the number of network layers increases, the learned features become more semantically abstract and better summarize the image at the semantic level.
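The abstract reports its results as mAP on PASCAL VOC. As a rough illustration of what that metric measures, the sketch below computes intersection over union (IoU) between boxes and the average precision (AP) for a single class; mAP is then the mean of the per-class AP values. This is a minimal sketch under stated assumptions, not the thesis's evaluation code: the function names `iou` and `average_precision` are hypothetical, all boxes are assumed to have positive area and to come from a single image for brevity, the 0.5 IoU threshold is the customary VOC setting, and the abstract does not say which VOC year or AP variant (11-point interpolation or area under the precision envelope) was used. The area-under-envelope variant is shown here.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one class (hypothetical helper, single-image simplification).

    detections: list of (score, box) pairs; gt_boxes: list of boxes.
    """
    # Visit detections from most to least confident.
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(gt_boxes)
    tp = np.zeros(len(detections))
    for i, (_, box) in enumerate(detections):
        overlaps = [iou(box, g) for g in gt_boxes]
        best = int(np.argmax(overlaps)) if overlaps else -1
        # A detection is a true positive only if it overlaps a not yet
        # matched ground-truth box enough; otherwise it is a false positive.
        if best >= 0 and overlaps[best] >= iou_thresh and not matched[best]:
            matched[best] = True
            tp[i] = 1.0
    cum_tp = np.cumsum(tp)
    recall = cum_tp / max(len(gt_boxes), 1)
    precision = cum_tp / np.arange(1, len(detections) + 1)
    # Build the monotonically decreasing precision envelope.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    for k in range(len(mpre) - 2, -1, -1):
        mpre[k] = max(mpre[k], mpre[k + 1])
    # Sum the envelope's area over the points where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)),
            (0.8, (50, 50, 60, 60)),
            (0.7, (20, 20, 30, 30))]
    print(average_precision(dets, gt))
```

In this toy example, two of the three detections match a ground-truth box and the middle-ranked one is a false positive, so the AP falls below 1. A full VOC-style evaluation would match detections per image across the whole test set and average the resulting AP over all object classes.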
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.41
Article ID: 2126769
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2126769.html