An Analysis of the Results of the "Chinese Web Page Automatic Classification Competition"

Published: 2018-06-29 11:15

Topic: computer applications + Chinese information processing; Reference: Journal of Chinese Information Processing (《中文信息學報》), 2003, Issue 05

[Abstract]: At the recently held "National Symposium on Search Engines and Web Information Mining", a "Chinese Web Page Automatic Classification Competition" took place, with 10 teams from across the country taking part. After introducing the rules and process of the competition, this paper analyzes its results in detail, which gives a concrete picture of the current state of Chinese web page automatic classification technology: existing classifiers show no clear gap in performance, and classifying Chinese web pages is much harder than classifying ordinary text. The paper also attempts to put forward a standard sample set of Chinese web page classification instances, in the hope that, through continued refinement, it can eventually serve as a basic corpus for research on Chinese web page classification technology.
[Affiliation]: Department of Computer Science and Technology, Peking University
[Funding]: Supported by the National 973 Major Basic Research Program (G1999032706)
[Classification No.]: TP393.09

Article ID: 2081934

Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2081934.html
