Monocular Image Depth Estimation Based on Deep Learning

Published: 2019-06-08 12:08
[Abstract]: 3D scene understanding is an important research topic in computer vision, and depth estimation is a key means of recovering the 3D geometry of a scene. In many computer vision tasks, such as semantic segmentation, pose estimation, and object detection, adding reasonably accurate and reliable depth information substantially improves performance over using RGB images alone. Traditional monocular depth estimation methods all rely on optical and geometric constraints or on assumptions about the environment, for example structure from motion, focus, or changes in illumination. Without such constraints or assumptions, building a computer vision system that accurately estimates depth from a single monocular image is extremely challenging. The task poses two main difficulties. First, a typical computer vision system, unlike the human brain, struggles to extract enough information from a monocular image to infer 3D structure. Second, the task is inherently ill-posed: a single 2D image corresponds to infinitely many real 3D scenes. This inherent ambiguity in mapping a single image to a depth map means that a vision model cannot estimate exact depth values from a single image alone.

To address these two difficulties, this thesis proposes the following methods. First, it proposes a computer vision model that unifies a convolutional neural network (CNN) and a conditional random field (CRF) within a single deep learning framework. The CNN extracts rich, relevant features, and the CRF refines the CNN output according to pixel position and color. Second, to address the ill-posedness, it proposes a vision model that incorporates sparse known labels: taking some relatively accurate, already-obtained depth values as references greatly narrows the search range of plausible depth values at the remaining pixels, which reduces, to some extent, the ambiguity of the mapping from RGB image to depth map.

In summary, this thesis surveys recent research progress on depth estimation from monocular images, including the relevant datasets, methods, and their performance, and analyzes and discusses the open problems and future directions of monocular depth estimation. It also proposes a vision model that learns feature representations of depth information from monocular images and, in view of the ill-posedness of the problem, a vision model incorporating sparse known labels that reduces the ambiguity of the mapping between monocular images and depth maps. The effectiveness and superiority of both models are verified on the NYU Depth v2 dataset.
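The abstract gives no implementation details, so the following is only a toy sketch of the two ideas it names: refining an initial depth prediction with pairwise terms weighted by pixel position and color, and anchoring the result at a few pixels whose depth is already known. It swaps in a simple Gaussian-weighted quadratic energy solved by Jacobi iteration in NumPy, which is an assumption made here for illustration and not the thesis's CNN and CRF model. The function name `refine_depth`, all parameters, and the synthetic scene are hypothetical.

```python
import numpy as np

def refine_depth(pred, rgb, sparse=None, mask=None, iters=10,
                 sigma_xy=2.0, sigma_rgb=0.1, radius=2, lam=1.0):
    """Hypothetical helper, not the thesis's model.

    Jacobi iterations on the quadratic energy
        lam * (d_i - pred_i)^2 + sum_j w_ij * (d_i - d_j)^2,
    where w_ij is a Gaussian in pixel offset and color difference.
    Pixels selected by `mask` are clamped to the known depths in `sparse`.
    """
    pred = np.asarray(pred, dtype=np.float64)
    rgb = np.asarray(rgb, dtype=np.float64)
    h, w = pred.shape
    use_sparse = sparse is not None and mask is not None
    depth = pred.copy()
    if use_sparse:
        depth[mask] = sparse[mask]
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if (dy, dx) != (0, 0)]
    # Colors do not change between iterations, so pad them once.
    pad_c = np.pad(rgb, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    for _ in range(iters):
        num = lam * pred                # unary term pulls toward the prediction
        den = np.full((h, w), lam)
        pad_d = np.pad(depth, radius, mode="edge")
        for dy, dx in offsets:
            ys, xs = radius + dy, radius + dx
            nd = pad_d[ys:ys + h, xs:xs + w]   # neighbor depths
            nc = pad_c[ys:ys + h, xs:xs + w]   # neighbor colors
            wgt = np.exp(-(dy * dy + dx * dx) / (2 * sigma_xy ** 2)
                         - ((rgb - nc) ** 2).sum(axis=-1) / (2 * sigma_rgb ** 2))
            num += wgt * nd
            den += wgt
        depth = num / den
        if use_sparse:
            depth[mask] = sparse[mask]  # keep the known depths fixed
    return depth

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic scene: two flat surfaces with different colors and depths.
rng = np.random.default_rng(0)
gt = np.ones((32, 32))
gt[:, 16:] = 3.0
rgb = np.zeros((32, 32, 3))
rgb[:, 16:] = 1.0
pred = gt + rng.normal(0.0, 0.2, gt.shape)   # stand-in for a noisy network output
mask = np.zeros(gt.shape, dtype=bool)
mask[::8, ::8] = True                        # a sparse grid of known depths
refined = refine_depth(pred, rgb, sparse=gt, mask=mask)
```

On this synthetic scene the color term keeps the two surfaces from blending across their boundary, while the clamped pixels show how a handful of known depths can constrain the estimate at the remaining pixels.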
[Degree-granting institution]: Harbin University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41;TP18
