Research on Video Affective Content Analysis Based on Protagonist Information and Convolutional Neural Networks
Published: 2018-03-01 09:53
Keywords: affective analysis; video content analysis; protagonist information; convolutional neural network; optical flow. Source: Shenzhen University, 2017 master's thesis. Document type: degree thesis.
【Abstract】: With the rapid development of smart devices and technology, huge numbers of videos are shared to the Internet every day. Managing and organizing such massive video data calls for automatic video content analysis methods. Traditional content-based video analysis focuses mostly on the events and content occurring in a video, and rarely analyses the psychological response that this content evokes in viewers. Video affective content analysis instead predicts, from the viewer's perspective, the emotions a video is likely to elicit. Emotion recognition is an important and highly challenging topic in video content analysis. Most existing video affective analysis methods concentrate on how to extract more features for emotion analysis, which leaves open questions worth studying: what information in a video can convey emotion, and what kind of information acts on viewers to induce the corresponding emotional response. Moreover, most current methods use only the spatial-domain information of a video for affective analysis, and few consider its temporal-domain information. Motivated by these issues, this thesis proposes a novel video affective content analysis method based on protagonist information and convolutional neural networks. The main contributions are: (1) Typical affective analysis methods consider only relatively low-level features such as audio, ignoring video frames as an important carrier of emotional information. This thesis extracts key frames from a video, then uses a convolutional neural network to extract image features from these static key frames for the final video affective analysis. Since not every part of an image contributes to emotion elicitation, image patches are extracted from the key frames around SIFT keypoints to represent the image's emotional information. After patch features are extracted, different feature-fusion strategies are also explored for their effect on affective analysis. (2) Inspired by the observation that viewers' attention is mostly drawn to the people in a video, especially scenes where the leading actors appear, this thesis proposes face-based and protagonist-based video affective analysis methods; specifically, face detection and face recognition are incorporated into the key-frame extraction step. Because actor- and protagonist-based affective analysis places certain requirements on the video content, an emotion-annotated video database was also constructed. (3) Most existing methods analyse affective content only in the spatial domain. This thesis introduces optical flow, an important temporal-domain cue, into affective analysis: the extracted optical-flow fields are converted into RGB images, and a convolutional neural network then extracts features from these flow images for the affective analysis method. Optical flow captures, to some extent, the actions occurring in a video, and such action information can effectively stimulate the corresponding emotions in viewers.
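The patch-extraction step in contribution (1) can be sketched as follows. This is a minimal illustration, not the thesis's actual code: it assumes keypoint coordinates have already been produced by a SIFT detector (the detector itself, e.g. OpenCV's `cv2.SIFT_create`, is not shown), and the 32-pixel patch size is an assumed value.

```python
import numpy as np

def extract_patches(frame, keypoints, size=32):
    """Cut fixed-size square patches centred on each keypoint.

    frame     : H x W x 3 uint8 key-frame image
    keypoints : iterable of (x, y) coordinates, e.g. from a SIFT detector
    size      : patch side length in pixels (illustrative value)
    Keypoints too close to the image border are skipped.
    """
    h, w = frame.shape[:2]
    half = size // 2
    patches = []
    for x, y in keypoints:
        x, y = int(round(x)), int(round(y))
        if half <= x < w - half and half <= y < h - half:
            patches.append(frame[y - half:y + half, x - half:x + half])
    return patches

# Toy usage: a synthetic 100x100 frame with two hypothetical keypoints;
# the second lies too close to the border and is skipped.
frame = np.zeros((100, 100, 3), dtype=np.uint8)
patches = extract_patches(frame, [(50, 50), (3, 3)])
```

Each returned patch would then be fed to the CNN feature extractor; per the abstract, the per-patch features are subsequently combined by one of the fusion strategies the thesis compares.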
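The flow-to-image conversion in contribution (3) is commonly implemented by mapping flow direction to hue and flow magnitude to value in HSV space, then converting to RGB. The thesis does not specify its exact encoding, so the sketch below is a plain-numpy illustration under that common assumption; the dense flow field itself would come from an optical-flow estimator (e.g. the Farnebäck method).

```python
import colorsys
import numpy as np

def flow_to_rgb(flow):
    """Encode a dense optical-flow field as an RGB image.

    flow : H x W x 2 float array of per-pixel (dx, dy) displacements.
    Direction -> hue, normalised magnitude -> value, saturation fixed at 1.
    Returns an H x W x 3 uint8 image suitable as CNN input.
    """
    dx, dy = flow[..., 0], flow[..., 1]
    mag = np.hypot(dx, dy)
    hue = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)      # direction in [0, 1)
    val = mag / mag.max() if mag.max() > 0 else mag       # normalised magnitude
    h, w = mag.shape
    rgb = np.zeros((h, w, 3), dtype=np.uint8)
    for i in range(h):
        for j in range(w):
            r, g, b = colorsys.hsv_to_rgb(hue[i, j], 1.0, val[i, j])
            rgb[i, j] = (int(r * 255), int(g * 255), int(b * 255))
    return rgb

# Toy usage: uniform rightward motion maps every pixel to the same colour.
flow = np.zeros((4, 4, 2), dtype=np.float32)
flow[..., 0] = 1.0            # dx = 1 everywhere, dy = 0
img = flow_to_rgb(flow)
```

The resulting RGB flow images can be passed to the same kind of convolutional network used for the static key frames, giving the temporal-domain features the abstract describes.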
【Degree-granting institution】: Shenzhen University
【Degree level】: Master
【Year conferred】: 2017
【CLC number】: TP391.41; TP183
Document ID: 1551430
Link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1551430.html