Research and Application of Machine Learning in Film and Television Big Data Analysis

Published: 2019-03-26 21:27

[Abstract]: As a new growth point in China's national economy, the film and television industry attracts wide attention from market leaders, broadcasters, the operators of major video websites, and researchers. With the arrival of the big data era, the industry faces major challenges in data storage, processing, and analysis: traditional storage models, processing methods, and analysis techniques cannot meet the needs of applications that hold massive amounts of data. As mathematical statistics, artificial intelligence, and related fields have developed, a theoretical framework based on machine learning has gradually taken shape, and machine learning methods are now applied to massive data sets in order to extract useful knowledge and information. Studying how to use machine learning to uncover the hidden features and fluctuation trends in massive film and television data is therefore of real practical significance. This thesis uses machine learning methods to process and analyze film and television big data. Working with an intelligent film and television big data analysis system, it applies preprocessing, feature dimensionality reduction, chart analysis, and ratings prediction, in that order, to a large body of TV series ratings data, improving both the efficiency of data processing and the accuracy of ratings prediction. The approach offers researchers a workable way to address problems in film and television big data scenarios, and it may help film and television companies compete in the market and achieve higher ratings. The main work of the thesis is as follows:
[1] Preprocessing of high-dimensional film and television data based on the K-Means clustering algorithm. Attribute selection, data aggregation, and data normalization are applied to the selected TV series sample data, and the K-Means algorithm is then used to fill in missing values.
[2] Dimensionality reduction of high-dimensional film and television data based on factor analysis. For highly redundant, high-dimensional TV series feature data, factor analysis yields low-dimensional factors with little redundancy, which serve as the feature vectors after dimensionality reduction.
[3] Classification of TV series viewership levels and prediction of ratings based on the SVM and AdaBoost-BP algorithms. Ratings prediction models are built with both algorithms on the reduced feature data and used to predict the relevant data. The prediction results are then compared to identify the more effective algorithm.
[4] Ratings analysis and display based on the intelligent film and television big data analysis system. The processed TV series ratings data are analyzed through multi-level, multi-angle chart correlation analysis and visual display, and the prediction models proposed in the thesis are applied to ratings prediction on film and television big data, which verifies their effectiveness.
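Item [1] of the abstract mentions normalizing the data and then using K-Means to complete missing values, but gives no details. The following is a minimal Python sketch of one common way to do this, not the thesis's actual implementation. All names here (`min_max_normalize`, `partial_distance`, `kmeans_impute`) are hypothetical, `None` is assumed to mark a missing value, every column is assumed to have at least one observed value, and the farthest-point initialisation is a choice made only to keep the sketch deterministic.

```python
import math


def min_max_normalize(rows):
    """Scale every column to [0, 1]; None marks a missing value and is kept."""
    out = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(observed), max(observed)
        span = (hi - lo) or 1.0  # avoid dividing by zero for constant columns
        for r in out:
            if r[j] is not None:
                r[j] = (r[j] - lo) / span
    return out


def partial_distance(row, center):
    """Euclidean distance using only the features observed in `row`."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(row, center) if x is not None))


def kmeans_impute(rows, k, n_iter=20):
    """Return (filled_rows, labels); missing entries take the centroid value."""
    n_cols = len(rows[0])
    # First guess: fill every gap with the column mean of the observed values.
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    filled = [[means[j] if x is None else x for j, x in enumerate(r)] for r in rows]

    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [list(filled[0])]
    while len(centers) < k:
        far = max(filled, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment: compare each row to the centroids on observed features only.
        labels = [min(range(k), key=lambda c: partial_distance(r, centers[c])) for r in rows]
        # Update: recompute each centroid from the currently filled rows.
        for c in range(k):
            members = [filled[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Imputation: overwrite the gaps with the assigned centroid's coordinates.
        for i, r in enumerate(rows):
            for j, x in enumerate(r):
                if x is None:
                    filled[i][j] = centers[labels[i]][j]
    return filled, labels


# Toy data (made up for illustration): two well-separated groups with one gap each.
samples = [
    [1.0, 1.0], [1.2, 0.8], [0.9, None],
    [9.0, 9.0], [9.2, 8.8], [None, 9.1],
]
filled, labels = kmeans_impute(samples, k=2)
print(labels)
print(filled[2], filled[5])
```

On this toy data each gap converges toward the mean of the observed values of its cluster mates. In a real pipeline `min_max_normalize` would typically be applied before `kmeans_impute`, so that features on large scales do not dominate the distance.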
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP181; TP311.13


Article ID: 2447947


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2447947.html
