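Research on Scene Classification Methods Based on Semantic Analysis
Published: 2017-12-28 03:16
Article: Research on Scene Classification Methods Based on Semantic Analysis  Source: Harbin Institute of Technology, 2017 doctoral dissertation  Type: degree thesis
Related topics: scene classification; adaptive sample selection; cognitive model; knowledge representation and reasoning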
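【Abstract】: Scene understanding is one of the central challenges for computer vision theory and its applications. It covers scene classification, image segmentation, object detection and annotation, and related techniques. Among these, scene classification is a prerequisite for scene understanding and is indispensable in vision applications such as video surveillance and robot navigation and decision-making. It is therefore an important research topic in computer vision, machine learning, and pattern recognition. In recent years, rapid progress in computing technology and image sensors has broadened the ways images are acquired and advanced the vision field. For example, the popular image-sharing site Flickr stores more than six billion images, and the image-based social network Instagram has passed one hundred million active users. At the same time, more and more devices can capture images, which has driven a wave of smart-device adoption and widened the scenarios in which such devices are used. Abundant image data can give users better information resources, but at this volume manual classification can no longer keep up with demand, nor does it fit the trend toward intelligent devices. Studying scene classification methods that label categories automatically is thus a necessary route to more efficient image retrieval and broader intelligent vision applications. Existing scene classification methods fall mainly into classification based on low-level visual features and reasoning based on knowledge semantics. Feature-based methods train visual classifiers on visual features and usually perform well on small sample sets. Their main weakness is the semantic gap between low-level visual features and the high-level semantics that humans understand, so the features describe images poorly. Knowledge-semantics methods, when building knowledge bases and reasoning over them, rely mostly on semantic attributes and neglect the important role of visual attributes. To address scene classification, this thesis proposes a complete theoretical framework comprising image sample selection, image description with a bag-of-visual-words model extended across semantic levels, scene structure analysis, and construction of a visual-attribute knowledge base. The main contributions are as follows. 1. Starting from visual cognition, an automatic sample collection method is proposed. It addresses two problems of uncertainty-based active learning: the class distribution of samples is not considered, and the selected samples require additional labeling. A certainty evaluation based on the bag of visual words is introduced into the entropy-based uncertainty measure, so that active learning can collect samples effectively while labeling their classes automatically. In addition, the theory of negatively accelerated learning from cognitive psychology is used to adapt the iteration stopping condition. During training, a sample similarity measure assigns different weights to samples of different classes, and the weights are updated across iterations, which speeds up convergence. Experiments show that the method improves sample collection efficiency and that classifiers trained on the collected samples achieve better classification performance. 2. A scene classification method based on semantic-level extension is proposed to address the semantic gap that prevents low-level visual features from describing the high-level semantics of an image. Abstract semantics are introduced to extend the bag-of-words model across multiple levels, and a semantic preservation method is proposed that builds visual dictionaries at higher semantic levels on top of the primary visual dictionary produced by the bag-of-words model. Semantics are passed upward layer by layer to train the upper-level semantic classifiers, which improves the descriptive power of the bag-of-words model. At classification time, the class of a test sample is decided top-down, layer by layer. Experiments show that the proposed method outperforms other classification methods. 3. A hierarchical structure for indoor scenes is proposed to address the difficulty that indoor scenes vary widely in decoration and that different classes resemble one another, both of which hinder classifier training. Indoor scenes of different classes can be similar, while scenes of the same class can differ. Based on human cognitive patterns and the characteristics of indoor scenes, the thesis proposes a scene hierarchy. A hierarchy detection method partitions the structure automatically, and hierarchical semantics are used to express the structure of the indoor scene. Compared with existing classification methods, the proposed hierarchy describes indoor scenes better and thereby improves scene classification performance. 4. Building on indoor scene structure detection, a method for constructing a high-level knowledge base is proposed to classify indoor scenes. Indoor scene classification is a precondition for scene interaction, yet methods based on first-order logic ignore the pervasive hierarchical structure and the visual attributes when constructing a knowledge base. To address this, a knowledge representation and reasoning method for indoor scenes based on Markov logic networks is proposed. It introduces the scene hierarchy above together with visual attributes to build a high-level knowledge base with greater descriptive power. Experiments show that the constructed knowledge base is robust and classifies indoor scenes effectively. In summary, the thesis studies scene classification through sample selection, image description with a semantically extended bag of visual words, scene structure analysis, and visual-attribute knowledge base construction. The proposed methods together form a coherent scene classification framework and improve scene classification performance.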
[Abstract]: Scene understanding is one of the central challenges in computer vision theory and applications. It includes scene classification, image segmentation, object detection and annotation, and related techniques. Scene classification is a prerequisite for scene understanding and plays an indispensable role in vision applications such as video surveillance and robot navigation and decision-making. Research on scene classification is an important subject in computer vision, machine learning, and pattern recognition.

In recent years, the rapid development of computing technology and image sensors has expanded the means of image acquisition and advanced the vision field. For example, the number of images stored on the popular image-sharing site Flickr has exceeded six billion, and the number of active users of the image-based social network Instagram has exceeded one hundred million. At the same time, more and more devices can capture images, which has set off a wave of smart-device adoption and expanded the scenarios and range of device applications. Rich image data can provide users with better information resources, but the sheer volume makes manual classification increasingly unable to meet growing demand and is at odds with the trend toward intelligent devices. Studying scene classification methods that annotate categories automatically is therefore a necessary way to improve image retrieval efficiency and extend intelligent vision applications.

Existing scene classification methods mainly comprise classification based on low-level visual features and reasoning based on knowledge semantics. Feature-based methods train visual classifiers on visual features and usually perform well on small sample sets. Their main disadvantage is the semantic gap between low-level visual features and the high-level semantics understood by humans, so these features cannot describe images well. Knowledge-semantics methods emphasize semantic attributes and overlook the importance of visual attributes when constructing knowledge bases and reasoning. To address the scene classification problem, this thesis proposes a complete theoretical framework that includes image sample selection, image description with a bag-of-visual-words model extended across semantic levels, scene structure analysis, and visual-attribute knowledge base construction. The main contributions are as follows:

1. From the perspective of visual cognition, an automatic sample collection method is proposed. It addresses the problems that uncertainty-based active learning does not consider the class distribution of samples and requires additional annotation of the selected samples. A certainty evaluation based on the bag of visual words is introduced into the entropy-based uncertainty measure, so that active learning can collect samples effectively while labeling their classes automatically. In addition, the theory of negatively accelerated learning from cognitive psychology is used to adjust the iteration stopping condition adaptively. During training, a sample similarity measure sets different weights for samples of different classes, and the weights are updated across iterations to speed up convergence. Experimental results show that the method improves sample collection efficiency and that classifiers trained on the collected samples achieve better classification performance.

2. A scene classification method based on semantic-level extension is proposed to address the semantic gap that keeps low-level visual features from describing the high-level semantics of an image. Abstract semantics are introduced to extend the bag-of-words model across multiple levels, and a semantic preservation method is proposed that generates visual dictionaries at higher semantic levels from the primary visual dictionary built by the bag-of-words model. Semantics are passed upward layer by layer to train the upper-level semantic classifiers, which improves the descriptive power of the bag-of-words model. At classification time, the class of a test sample is determined top-down, layer by layer. Experimental results show that the proposed method achieves better classification performance than other classification methods.

3. A hierarchical structure for indoor scenes is proposed to address the problem that indoor scenes vary in decoration and that different classes resemble one another, which hinders classifier training. Indoor scenes of different classes can be similar, while scenes of the same class can differ. Based on human cognitive patterns and the characteristics of indoor scenes, the thesis proposes a scene hierarchy. A hierarchy detection method partitions the structure automatically, and hierarchical semantics are used to express the structure of the indoor scene. Compared with existing classification methods, the proposed hierarchy describes indoor scenes better and thus improves scene classification performance.

4. Building on indoor scene structure detection, a high-level knowledge base construction method is proposed to classify indoor scenes. Indoor scene classification is a precondition for scene interaction, but methods based on first-order logic ignore the pervasive hierarchical structure and the visual attributes when constructing a knowledge base. To address these shortcomings, a knowledge representation and reasoning method for indoor scenes based on Markov logic networks is proposed. It introduces the scene hierarchy above together with visual attributes to build a high-level knowledge base with greater descriptive power. Experimental results show that the constructed knowledge base is robust and classifies indoor scenes effectively.

In summary, this thesis studies scene classification through sample selection, image description with a semantically extended bag of visual words, scene structure analysis, and visual-attribute knowledge base construction. The proposed methods together form a coherent scene classification framework and improve scene classification performance.
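
The entropy-based uncertainty measure mentioned in the first contribution can be illustrated with a minimal sketch. This shows only the standard entropy-based uncertainty sampling baseline; it is not the thesis's method, which additionally combines the measure with a bag-of-visual-words certainty evaluation that the abstract does not specify. The function names and the probability values are hypothetical, chosen for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(prob_rows, k):
    """Return the indices of the k samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(prob_rows)),
        key=lambda i: entropy(prob_rows[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier outputs for four unlabeled images over three scene classes.
unlabeled_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]

picked = select_most_uncertain(unlabeled_probs, k=2)
print(picked)
```

In a plain active-learning loop, the selected samples would be sent for labeling, added to the training set, and the classifier retrained before the next round.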
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Doctoral
【Year degree conferred】: 2017
【Classification number】: TP391.41
Article ID: 1344346
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1344346.html