Research on Scene Classification Methods Based on Semantic Analysis
Published: 2017-12-28 03:16
Source: Harbin Institute of Technology, doctoral dissertation, 2017. Document type: degree thesis
Keywords: scene classification; adaptive sample selection; cognitive model; knowledge representation and reasoning
[Abstract]: Scene understanding is one of the central challenges of computer vision, in both theory and application. It covers scene classification, image segmentation, and object detection and annotation, among other techniques. Scene classification is a prerequisite for scene understanding and plays an indispensable role in vision applications such as video surveillance and robot navigation and decision-making, which makes it an important research topic in computer vision, machine learning and pattern recognition.

In recent years, rapid progress in computing technology and image sensors has broadened the ways images are acquired and has advanced the vision field. For example, the popular image-sharing site Flickr stores more than six billion images, and the image-based social network Instagram has passed one hundred million active users. At the same time, more and more devices can capture images, which has driven a wave of smart-device adoption and widened the scenarios in which such devices are used. Abundant image data gives users better information resources, but at this volume manual classification can no longer keep up with demand, nor does it fit the trend toward intelligent devices. Scene classification methods that annotate categories automatically are therefore a necessary route to more efficient image retrieval and to wider intelligent vision applications.

Existing scene classification methods fall mainly into classification methods based on low-level visual features and reasoning methods based on knowledge semantics. These methods train visual classifiers on visual features and usually perform well on small sample sets. They have two main shortcomings. First, there is a semantic gap between low-level visual features and the high-level semantics that humans understand, so low-level features describe images poorly. Second, knowledge-semantics methods, when constructing the knowledge base and reasoning over it, lean on semantic attributes and overlook the important role of visual attributes.

To address the scene classification problem, this thesis proposes a complete theoretical framework comprising image sample selection, image description with a bag-of-visual-words model extended across semantic levels, scene structure analysis, and construction of a visual-attribute knowledge base. The main contributions are as follows.

1. Starting from visual cognition, an automatic sample collection method is proposed. It addresses two problems of uncertainty-based active learning: the class distribution of the samples is not considered, and the selected samples need additional annotation. A certainty evaluation based on the bag of visual words is introduced into the entropy-based uncertainty measure, so that the active learning method labels sample categories automatically while collecting samples effectively. In addition, the theory of negatively accelerated learning from cognitive psychology is used to adapt the iteration stopping condition. During training, a sample similarity measure assigns different weights to samples of different classes, and these weights are updated across iterations to speed up convergence. Experimental results show that the method improves sample collection efficiency and that classifiers trained on the collected samples achieve better classification performance.

2. A scene classification method based on semantic-level extension is proposed to address the semantic gap that prevents low-level visual features from describing the high-level semantics of an image. Abstract semantics are introduced to extend the bag-of-words model across multiple levels, and a semantic preservation method generates visual dictionaries at higher semantic levels from the primary visual dictionary built by the bag-of-words model. Semantics are passed upward layer by layer in a bottom-up manner to train the upper-level semantic classifiers, which improves the descriptive power of the bag-of-words model. At classification time, the category of a test sample is decided layer by layer in a top-down manner. Experimental results show that the proposed method achieves better classification performance than other classification methods.

3. A hierarchical structure for indoor scenes is proposed. It addresses the difficulty of training classifiers when the decoration of indoor scenes varies widely within a class while different classes resemble one another. Based on the regularities of human cognition and the characteristics of indoor scenes, the thesis proposes a scene hierarchy; a hierarchical detection method partitions the hierarchy automatically, and hierarchical semantics are used to express the structure of an indoor scene. Compared with existing classification methods, the proposed hierarchy describes indoor scenes better and thereby improves scene classification performance.

4. Building on indoor scene structure detection, a method for constructing a high-level knowledge base is proposed for classifying indoor scenes. Indoor scene classification is a precondition for scene interaction, yet methods based on first-order logic ignore the pervasive hierarchical structure and visual attributes when constructing the knowledge base. To remedy this, a knowledge representation and reasoning method for indoor scenes based on Markov logic networks is proposed. It introduces the scene hierarchy above, together with visual attributes, into a high-level knowledge base in order to improve the descriptive power of the knowledge base. Experimental results show that the constructed knowledge base is robust and classifies indoor scenes effectively.

In summary, the thesis studies sample selection, image description with a semantically extended bag of visual words, scene structure analysis and visual-attribute knowledge base construction. Together the proposed methods form a scene classification framework and improve scene classification performance.
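
The abstract does not give the formulas behind the first contribution, so the following is only a minimal sketch of the general idea: an entropy-based uncertainty measure decides which samples are worth collecting, and a certainty check on bag-of-visual-words histograms decides whether a collected sample can be labelled automatically. The function names, the histogram-intersection similarity, the per-class prototype histograms and both thresholds are assumptions made for illustration and are not taken from the thesis.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def histogram_intersection(h1, h2):
    """Similarity of two L1-normalised visual-word histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def select_and_label(candidates, class_prototypes,
                     min_entropy=0.5, min_similarity=0.8):
    """Collect uncertain samples that can still be labelled automatically.

    candidates: list of (sample_id, class_probs, bovw_histogram)
    class_prototypes: dict mapping label -> prototype histogram (hypothetical)
    Both thresholds are illustrative assumptions.
    """
    selected = []
    for sample_id, probs, hist in candidates:
        # Uncertainty test: skip samples the classifier is already sure about.
        if shannon_entropy(probs) < min_entropy:
            continue
        # Certainty test: find the most similar class prototype.
        label, similarity = max(
            ((lbl, histogram_intersection(hist, proto))
             for lbl, proto in class_prototypes.items()),
            key=lambda pair: pair[1],
        )
        # Auto-label only when the histogram clearly matches one class.
        if similarity >= min_similarity:
            selected.append((sample_id, label))
    return selected


if __name__ == "__main__":
    prototypes = {"kitchen": [0.7, 0.2, 0.1], "office": [0.1, 0.2, 0.7]}
    pool = [
        ("a", [0.5, 0.5], [0.6, 0.3, 0.1]),     # uncertain, close to "kitchen"
        ("b", [0.99, 0.01], [0.7, 0.2, 0.1]),   # classifier already confident
        ("c", [0.5, 0.5], [0.34, 0.33, 0.33]),  # uncertain, matches no class well
    ]
    print(select_and_label(pool, prototypes))
```

In this sketch a sample is kept only when both tests pass, which is one simple way to combine the two measures; a weighted sum of uncertainty and certainty scores would be another, and the abstract does not say which form the thesis uses.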
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctorate
[Year conferred]: 2017
[Classification number]: TP391.41
Article ID: 1344346
Article link: http://sikaile.net/shoufeilunwen/xxkjbs/1344346.html