Research on Key Computer Vision Technologies Based on Convolutional Neural Networks
Published: 2018-08-14 14:08
[Abstract]: In recent years, research on deep learning has attracted wide interest in academia and industry and has driven the rapid development of a range of applied research in artificial intelligence. The convolutional neural network (CNN), an important branch of deep learning, is especially closely tied to research on computer vision. With the continual optimization of network architectures and the emergence of massive datasets, CNNs have recently achieved breakthroughs in a series of computer vision applications. However, computer vision covers a very broad range of topics, and substantial room for research remains both in the depth of particular techniques and in the breadth of the field. Research in computer vision can be divided into three levels: low-level feature research, mid-level semantic feature representation, and high-level semantic understanding. Based on CNNs, this thesis explores a representative key technology at each of the three levels: object recognition, scene labeling, and scene recognition. The three main research topics and their respective contributions are as follows.

First, in object recognition, to address the tendency of single-column CNNs to overfit, the thesis studies the effect of heterogeneous multi-column CNNs on object recognition. Experiments on mainstream datasets show that heterogeneous multi-column CNNs effectively improve generalization compared with single-column CNNs. To address the problems that traditional network fusion offers only a single fusion mode and generalizes poorly, a network fusion strategy based on a sliding window is proposed. The sliding-window strategy selectively fuses the confidence information output by the different networks. Compared with a traditional single fusion mode, it is a more general method that also subsumes the classical fusion strategies, and it effectively improves the result of network fusion.

Second, in scene labeling, a CNN-based scene labeling method is proposed that outperforms classical algorithms on mainstream indoor and outdoor scene labeling datasets. For the feature learning problem in scene labeling, the thesis studies how CNN features trained for the task and generic CNN features perform in scene labeling. To address the regional consistency problem in the labeling results of traditional scene labeling algorithms, a regional consistency incentive algorithm is proposed. It uses the global edge probability of the scene image to iteratively improve the regional consistency of the labeling. Experiments on public datasets show that it achieves better labeling accuracy and visual consistency than classical algorithms of the same kind.

Third, in scene recognition, a scene recognition method based on feature learning over multi-scale salient regions is proposed, and in experiments on public datasets it achieves better recognition results than comparable classical algorithms. Because the content of scene images is relatively complex, a strategy for identifying salient regions is proposed, and the multi-scale information of those regions is used to represent a scene image. Because traditional hand-crafted features discriminate weakly in scene recognition, a CNN transfer learning strategy is used to learn scene image features over the multi-scale salient regions and build the feature representation. Experiments show that feature learning based on multi-scale salient regions effectively improves scene recognition accuracy, and that CNN transfer-learned features discriminate better than traditional hand-crafted features.

In summary, this thesis studies three key computer vision technologies on the basis of CNNs. For each specific problem, it designs the CNN architecture and its mode of application, and it proposes effective solutions to specific problems in each domain. Experiments on public datasets show that the proposed methods achieve better experimental results in their respective domains than classical traditional methods.
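The abstract says only that the sliding-window strategy selectively fuses the confidence outputs of several networks and subsumes the classical fusion strategies. It does not define the window. The following is a minimal sketch of one possible reading, not the thesis's actual method. It assumes that, for each class, the confidences from all networks are sorted in descending order and a window of ranks is averaged. The function names `sliding_window_fusion` and `predict` and the parameters `start` and `width` are hypothetical.

```python
from typing import List, Sequence


def sliding_window_fusion(
    confidences: Sequence[Sequence[float]], start: int, width: int
) -> List[float]:
    """Fuse per-class confidences from several networks (illustrative only).

    confidences[k][c] is the confidence of network k for class c.
    ASSUMPTION: for each class, the scores from all networks are ranked in
    descending order and the ranks start .. start + width - 1 are averaged.
    """
    num_nets = len(confidences)
    if num_nets == 0:
        raise ValueError("need at least one network output")
    if width < 1 or start < 0 or start + width > num_nets:
        raise ValueError("window must lie inside the ranked scores")
    num_classes = len(confidences[0])
    if any(len(row) != num_classes for row in confidences):
        raise ValueError("all networks must score the same classes")

    fused = []
    for c in range(num_classes):
        # Rank this class's scores across networks, highest first.
        ranked = sorted((row[c] for row in confidences), reverse=True)
        # Average only the scores that fall inside the window.
        window = ranked[start:start + width]
        fused.append(sum(window) / width)
    return fused


def predict(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused confidence."""
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up confidences from three hypothetical networks over three classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]

average_fusion = sliding_window_fusion(scores, start=0, width=3)
max_fusion = sliding_window_fusion(scores, start=0, width=1)
median_fusion = sliding_window_fusion(scores, start=1, width=1)
```

Under this reading, a window covering every rank reduces to average fusion and a width-1 window at the top rank reduces to max fusion, which is one way a windowed scheme could be "compatible with the classical fusion strategies". Intermediate windows, such as the median above, discard the most extreme scores and can change the predicted class.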
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TP391.41; TP18
Article ID: 2183103
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2183103.html