Research on Automatic Video Classification Algorithms Based on Bimodal Features and Support Vector Machines
Published: 2018-01-27 06:27
Keywords: video classification; features; bimodal; secondary prediction; support vector machine. Source: Shanghai Jiao Tong University, master's thesis, 2010. Document type: degree thesis
[Abstract]: Automatic classification of video content is an important research topic in computer vision, since it eases the management of ever-growing volumes of video data. As a key technology for controlling video distribution, content-based automatic video classification is essential to the orderly management of online media. With it, media websites can classify massive amounts of video automatically, which allows more effective organization, storage, and retrieval, and can also perform an automatic first-pass screening of objectionable material such as horror and violent videos.
The performance of an automatic video classification algorithm depends heavily on how video features are extracted and which classification algorithm is chosen. Approaching the problem from the perspective of video content and video style, this thesis proposes a video feature extraction method that combines visual and audio bimodal features, together with an improved support vector machine (SVM) video classification algorithm. Together they automatically classify five common types of video (cartoon, advertisement, music, news, and sports) and automatically identify horror and violent scenes in movies.
First, building on an analysis of existing video classification algorithms and their shortcomings, and on an analysis of the visual differences among the five common video types, the thesis proposes a new feature representation, the MPEG-7 visual descriptor combination model. Nine descriptors covering four aspects (color, texture, shape, and motion) are extracted to form a new overall visual feature, which gave good results. For horror and violent scenes, the thesis uses features from both the visual and audio modalities, chosen according to the characteristics of those scenes; compared with a single-modality feature, this raises the accuracy of scene pattern matching and gives satisfactory effectiveness and discriminability.
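The abstract does not say how the visual and audio features are combined into one vector. As a minimal sketch of one common option, assuming simple early fusion, the Python below standardizes each modality separately and then concatenates the two vectors per clip. The function names (`zscore_columns`, `fuse_bimodal`) and the toy numbers are hypothetical and do not come from the thesis.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column of a list of equal-length feature vectors."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead of 0.0.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def fuse_bimodal(visual_rows, audio_rows):
    """Assumed early fusion: normalize each modality, then concatenate per clip."""
    if len(visual_rows) != len(audio_rows):
        raise ValueError("each clip needs one visual and one audio vector")
    visual = zscore_columns(visual_rows)
    audio = zscore_columns(audio_rows)
    return [v + a for v, a in zip(visual, audio)]


# Toy values only: three clips, two visual features and one audio feature each.
visual = [[0.2, 10.0], [0.4, 30.0], [0.6, 20.0]]
audio = [[100.0], [300.0], [200.0]]
fused = fuse_bimodal(visual, audio)
```

Normalizing per modality before concatenation keeps a feature with a large numeric range (here the audio value) from dominating the distance computations inside a kernel classifier such as an SVM.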
After suitable features have been selected and extracted, the thesis addresses the difficulty that current statistical methods have in designing an effective classifier from a small sample set. It proposes an automatic video classification algorithm based on SVMs and improves the classifier's decision strategy with a secondary prediction mechanism built on the SVM one-versus-one (1-1) method, which further raises the accuracy of SVM multi-class classification.
Simulation results show that, first, the feature selection highlights the differences between video categories and improves the discriminability of the videos to be classified; second, the improved secondary prediction mechanism improves the performance of SVM multi-class video classification; and finally, in comparison experiments against related existing algorithms, both the five-class video classification experiment and the horror and violent scene recognition experiment show that the proposed algorithm achieves higher classification accuracy.
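The abstract does not spell out how the secondary prediction works. One plausible reading, shown here purely as an assumption, is that ordinary one-versus-one voting runs first, and when the two top-ranked classes receive the same number of votes, the binary classifier trained on exactly those two classes is consulted again to break the tie. In the sketch below the pairwise deciders are fixed stubs standing in for trained binary SVMs; `one_vs_one_vote`, `predict_two_stage`, and the `winners` table are hypothetical names and data.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["cartoon", "advertisement", "music", "news", "sports"]


def one_vs_one_vote(x, pairwise, classes):
    """First stage: each pairwise classifier casts one vote for its winner."""
    votes = Counter({c: 0 for c in classes})
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    return votes


def predict_two_stage(x, pairwise, classes):
    """One-versus-one voting plus an assumed tie-breaking second stage."""
    votes = one_vs_one_vote(x, pairwise, classes)
    # Stable sort: classes with equal votes keep their order in `classes`.
    ranked = sorted(classes, key=lambda c: -votes[c])
    first, second = ranked[0], ranked[1]
    if votes[first] > votes[second]:
        return first
    # Assumed second stage: ask the classifier for this specific pair.
    key = (first, second) if (first, second) in pairwise else (second, first)
    return pairwise[key](x)


# Stub deciders (NOT trained SVMs): each pair always returns a fixed winner.
winners = {
    ("cartoon", "advertisement"): "cartoon",
    ("cartoon", "music"): "music",
    ("cartoon", "news"): "news",
    ("cartoon", "sports"): "sports",
    ("advertisement", "music"): "advertisement",
    ("advertisement", "news"): "news",
    ("advertisement", "sports"): "sports",
    ("music", "news"): "news",
    ("music", "sports"): "music",
    ("news", "sports"): "sports",
}
pairwise = {pair: (lambda x, w=w: w) for pair, w in winners.items()}

votes = one_vs_one_vote(None, pairwise, CLASSES)
label = predict_two_stage(None, pairwise, CLASSES)
```

In this toy setup news and sports tie on votes, so a plain first-maximum rule would pick whichever class happens to be listed first, while the second stage defers to the news-versus-sports decider. In practice each stub would be replaced by the decision function of a binary SVM trained on the corresponding pair of classes.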
[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree awarded]: 2010
[Classification number]: TP391.41
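[Citing literature]
Related master's theses (1 entry)
1 Feng Bing; Research on Multimodal Video Content Recognition Algorithms Based on Spatio-Temporal Features and the Bag-of-Words Model [D]; Shanghai Jiao Tong University; 2011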
Article ID: 1467823
Article link: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/1467823.html