Research on Web Video Information Extraction
[Abstract]: In the information age, the amount of information on the network has grown dramatically. General search engines such as Baidu and Google increasingly feel the pressure of slow retrieval and high hardware requirements that very large databases bring. They also have great difficulty with search accuracy, unified storage, and unified display. In this environment, vertical search engines that focus on a specific domain have flourished: they lack the breadth of a general search engine, but they avoid the shortcomings above. In recent years, video websites have sprung up in large numbers. Because different video websites differ in display style and video database, how to return the videos users need conveniently and accurately is a problem that needs to be solved. In addition, some unscrupulous businesses and users distribute videos that distort facts, or pornographic videos, on the Internet, which has a negative impact on the public, so the relevant regulatory departments need a tool for unified retrieval of network video. Although some search engines now cooperate with video websites to achieve unified video retrieval by exchanging video-related information, this cooperation involves only the larger video websites. Achieving video retrieval over a wider range therefore requires Web video information extraction. As the intersection of vertical search engines and Web video retrieval, Web video information extraction has received growing attention and will play a more important role. In practice, however, some existing web page classification and page purification methods do not fully take the characteristics of Web video pages into account, which makes implementation difficult.
Starting from the actual state of web video websites, this paper first analyzes the types of web pages found on video websites and concludes that extracting information from video playing pages gives good results. Then, according to the characteristics of video playing pages, the paper describes methods for classifying web pages by template, visual features, feature scripts, and so on. Finally, for page purification, the noise on video playing pages is divided into three categories: background noise, random noise, and residual noise, which are eliminated by template analysis, page structure analysis, and semantic analysis, respectively. Experimental comparison and analysis show that the web page classification and page purification methods described in this paper achieve good results in Web video information extraction.
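The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: classifying a page as a video playing page by feature scripts, and removing template (background) noise. Everything in it is an assumption made for illustration rather than the thesis's method: the names `PLAYER_TAGS`, `PLAYER_KEYWORDS`, `is_video_play_page`, and `strip_template_noise` are hypothetical, the keyword list is invented, and treating "text blocks shared with a sibling page of the same site" as template noise is a simplification.

```python
from html.parser import HTMLParser

# Hypothetical feature lists: tags and keywords assumed to hint at an embedded player.
PLAYER_TAGS = {"video", "embed", "object"}
PLAYER_KEYWORDS = ("player", "swfobject", ".flv", ".mp4", ".swf")


class PlayerFeatureScanner(HTMLParser):
    """Records whether a page contains player-like tags or script keywords."""

    def __init__(self):
        super().__init__()
        self.found_tag = False
        self.found_script = False
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag in PLAYER_TAGS:
            self.found_tag = True
        if tag == "script":
            self._in_script = True
            # An external script whose URL mentions a player keyword also counts.
            src = dict(attrs).get("src") or ""
            if any(k in src.lower() for k in PLAYER_KEYWORDS):
                self.found_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Inline script text is checked for the assumed player keywords.
        if self._in_script and any(k in data.lower() for k in PLAYER_KEYWORDS):
            self.found_script = True


def is_video_play_page(html):
    """Guess whether a page is a video playing page (assumed heuristic)."""
    scanner = PlayerFeatureScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.found_tag or scanner.found_script


def strip_template_noise(page_blocks, sibling_blocks):
    """Drop text blocks that also occur on a sibling page of the same site."""
    shared = set(sibling_blocks)
    return [block for block in page_blocks if block not in shared]
```

A page would first pass through `is_video_play_page`, and only pages classified as playing pages would go on to purification. Random noise and residual noise, which the abstract assigns to page structure and semantic analysis, are not covered by this sketch.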
【Degree-granting institution】: Wuhan University of Technology
【Degree level】: Master's
【Year degree granted】: 2013
【Classification number】: TP391.3