Surveillance Video Structuring and Retrieval in Cross-View Camera Networks

Published: 2018-11-27 10:12

[Abstract]: Video surveillance is an important tool for urban public safety. As the number of surveillance cameras and the volume of surveillance video grow rapidly, traditional monitoring that relies on human operators is increasingly unable to meet demand, and surveillance technology based on intelligent algorithms is urgently needed. The key problems in intelligent video surveillance are "surveillance video content structuring" and "surveillance object retrieval". Around these two problems, this thesis (1) studies group target tracking to address target metadata acquisition in video content structuring; (2) studies multi-attribute image recognition to address target understanding and description in video content structuring; and (3) studies cross-view pedestrian group re-identification to address image-based retrieval in surveillance object retrieval. Group target tracking obtains the motion video clip and trajectory of each pedestrian, providing important material for subsequent analysis. Multi-attribute image recognition generates a high-level semantic description for each monitored object, which supplies high-level semantic features for image-based retrieval and also makes natural-language retrieval possible. Cross-view pedestrian group re-identification is an important complement to single-person re-identification and provides a technical basis for cross-view pedestrian retrieval based on appearance features (not faces) in video surveillance.
The main research work and contributions of this thesis are as follows. (1) A group target tracking algorithm based on group relationship evolution is proposed. The algorithm integrates low-level keypoint tracking, mid-level image patch detection and tracking, and high-level group relationship evolution into a unified framework. Unlike earlier approaches that compute optical flow, track keypoints, or detect pedestrians, the thesis represents the crowd as a set of image patches that are distinctive and stable in appearance. At the low level, keypoint tracking provides very accurate local trajectories, which can be used to detect image patches and to infer the group relationships within the crowd. At the mid level, the proposed hierarchical tree structure models and learns the spatial relationships among image patches. At the high level, the evolution of group relationships lets the hierarchical tree be updated dynamically through splitting, merging, and similar operations. Experimental results show that the proposed patch detection method provides important auxiliary information for tracking a given target; that the proposed dynamic hierarchical tree effectively learns the spatial relationships among targets; and that the proposed algorithm significantly improves the accuracy of group target tracking. (2) A multi-attribute image recognition algorithm based on spatial geometric relationships is proposed. The algorithm uses a deep convolutional neural network that can be trained end to end to learn the spatial and semantic relationships among attributes simultaneously, using only the attribute labels of each image as the supervision signal.
Specifically, for an input image, the proposed Spatial Regularization Network (SRN) generates an attention map for each candidate attribute label and, based on these attention maps, learns the spatial and semantic relationships among attributes simultaneously. Finally, the per-attribute confidence scores from the SRN are added to those from a base convolutional neural network (e.g., the residual network ResNet-101) to correct the attribute confidence scores. Experiments on several public datasets of different types show that the SRN effectively learns the spatial geometric relationships among attributes in an image, and that these relationships significantly improve the accuracy of multi-attribute image recognition. (3) A pedestrian group re-identification algorithm based on patch matching is proposed. Compared with single-person re-identification, pedestrian group re-identification faces additional problems, such as severe mutual occlusion among pedestrians within a group and changes in their relative positions across camera views. To address these problems, the thesis models pedestrian group re-identification as a matching problem between two sets of image patches. First, the proposed saliency channel filters out patch matches that have low appearance similarity or are not discriminative; then, the proposed spatial consistency matching further screens the resulting candidate matches, discarding those whose spatial matching relationships are inconsistent, to produce the final similarity between the two images.
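The two-stage filtering described for contribution (3) can be illustrated with a rough sketch. The abstract does not specify how the saliency channel or the spatial consistency matching are computed, so both stages below are simplified stand-ins and not the thesis's method: stage 1 assumes a cosine-similarity threshold plus a best-versus-runner-up margin, and stage 2 assumes a median-displacement test (which ignores the relative position changes the thesis sets out to handle). The function names, the `(feature, (x, y))` patch layout, the thresholds, and the final normalization are all hypothetical.

```python
import math
from statistics import median

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def candidate_matches(patches_a, patches_b, min_sim=0.8, min_margin=0.1):
    """Stage 1 (assumed stand-in for the saliency channel): keep a patch's best
    match only if it is similar enough and clearly better than the runner-up."""
    matches = []
    for i, (feat_a, _) in enumerate(patches_a):
        sims = sorted(((cosine(feat_a, feat_b), j)
                       for j, (feat_b, _) in enumerate(patches_b)), reverse=True)
        if not sims:
            continue
        best_sim, best_j = sims[0]
        runner_up = sims[1][0] if len(sims) > 1 else 0.0
        if best_sim >= min_sim and best_sim - runner_up >= min_margin:
            matches.append((i, best_j, best_sim))
    return matches

def spatially_consistent(matches, patches_a, patches_b, tol=20.0):
    """Stage 2 (assumed stand-in for spatial consistency matching): keep matches
    whose displacement is within `tol` pixels of the median displacement."""
    if not matches:
        return []
    shifts = [(patches_b[j][1][0] - patches_a[i][1][0],
               patches_b[j][1][1] - patches_a[i][1][1]) for i, j, _ in matches]
    mx = median(dx for dx, _ in shifts)
    my = median(dy for _, dy in shifts)
    return [m for m, (dx, dy) in zip(matches, shifts)
            if math.hypot(dx - mx, dy - my) <= tol]

# Hypothetical toy data: each patch is (feature vector, (x, y) position).
patches_a = [([1, 0, 0], (10, 10)), ([0, 1, 0], (50, 10)),
             ([0, 0, 1], (30, 60)), ([1, 1, 0], (70, 70))]
patches_b = [([1, 0, 0], (15, 12)), ([0, 1, 0], (55, 12)),
             ([0, 0, 1], (200, 200))]

candidates = candidate_matches(patches_a, patches_b)
kept = spatially_consistent(candidates, patches_a, patches_b)
# Assumed image-level similarity: summed match similarity per query patch.
score = sum(sim for _, _, sim in kept) / max(len(patches_a), 1)
```

In this toy data the fourth query patch is dropped in stage 1 because its best similarity is below the threshold, and the third match is dropped in stage 2 because its displacement disagrees with the other two.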
Experimental results show that the proposed algorithm significantly outperforms current mainstream object re-identification algorithms, and that its two components (the saliency channel and spatial consistency matching) reinforce each other in improving pedestrian group re-identification performance.
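The score correction step described for the SRN in contribution (2), where per-attribute scores from the SRN are added to those of the base network, might look roughly like the sketch below. The abstract states only that the two sets of confidence scores are summed; treating them as logits, applying a sigmoid, and thresholding at 0.5 are assumptions made for illustration, and the function names and example values are hypothetical.

```python
import math

def fuse_attribute_scores(base_scores, srn_scores):
    """Add per-attribute scores from the base CNN and from the SRN branch."""
    if len(base_scores) != len(srn_scores):
        raise ValueError("both branches must score the same attributes")
    return [b + s for b, s in zip(base_scores, srn_scores)]

def sigmoid(x):
    """Map a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_attributes(base_scores, srn_scores, threshold=0.5):
    """Assumed decision rule: sigmoid of the fused score, then a threshold."""
    fused = fuse_attribute_scores(base_scores, srn_scores)
    return [sigmoid(z) >= threshold for z in fused]

# Hypothetical scores for three attributes.
base = [2.0, -0.5, -3.0]   # base network (e.g. a ResNet-101 classifier head)
srn = [0.5, 1.0, -0.5]     # SRN branch

fused = fuse_attribute_scores(base, srn)
predictions = predict_attributes(base, srn)
```

Here the second attribute is negative under the base network alone but becomes positive once the SRN score is added, which is the kind of correction the abstract describes.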
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TP391.41

[Citation]: Zhu Feng. Surveillance Video Structuring and Retrieval in Cross-View Camera Networks [D]. University of Science and Technology of China, 2017.


Article No.: 2360371


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2360371.html
