
Research on Web Advertisement Clustering Methods Based on Multi-Feature Information Fusion

Published: 2018-04-20 14:51

  Topic of this article: Web advertising + multi-feature; Reference: Harbin Institute of Technology, 2014 master's thesis


[Abstract]: With the rapid development of the Internet, Web advertising has become an important source of revenue for Internet service providers and an effective channel for many traditional industries to promote their brands and products. Massive Web advertising data conceals high-value information and knowledge, so mining it effectively has become a key problem for many Internet applications. In Web advertising data mining, cluster analysis is an important foundational technique: it can be used to analyze competitors, and it can help governments and evaluation agencies assess and forecast economic development. Web advertising data contains many kinds of features, but no single feature fully describes a Web advertisement object; fusing multiple features makes a comprehensive description possible. This thesis therefore studies Web advertisement clustering based on multi-feature information fusion. The main work is as follows. (1) The characteristics of Web advertising are analyzed, and relevant data sets are collected and constructed. Feature extraction methods for Web advertising data are studied, and one text feature extraction method based on fuzzy matching and four image feature extraction methods are implemented. (2) The feature space of Web advertising data is high-dimensional and sparse, and the separation of two clusters is often determined by a very small number of features. To distinguish the importance of these few features, the thesis improves the objective function of EW-kmeans so that it accounts for the effect of both inter-cluster and intra-cluster distance on clustering quality, proposes a third-order tensor weighted k-means method based on a discriminative subspace (Dkmeans), and gives the corresponding theoretical proof. Experiments show that Dkmeans achieves better clustering results than the latest related clustering algorithms on six public data sets. (3) Fusion experiments are run with different combinations of Web advertisement features. The experiments show that every combination improves Web advertisement clustering to some degree, and that fusing all features gives the best result, which confirms that multi-feature fusion improves the clustering of Web advertisements.
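The abstract does not give the Dkmeans objective, so it cannot be reproduced here. As a rough illustration of the general idea it builds on, the following is a minimal sketch that assumes "EW-kmeans" refers to the entropy-weighted k-means family, in which each cluster learns its own feature weights and features with low within-cluster dispersion receive higher weight. It is not the thesis's method: it uses only within-cluster dispersion and ignores the inter-cluster term the thesis adds. The function names `fuse_features` and `entropy_weighted_kmeans`, the `gamma` and `init` parameters, and the choice of min-max scaling followed by concatenation as the fusion step are all assumptions made for illustration.

```python
import math
import random


def fuse_features(*blocks):
    """Concatenate several feature blocks into one vector per object.

    Each block is a list of equal-length vectors, one per object. Every
    column is min-max scaled to [0, 1] so that no block dominates by scale.
    (Assumed fusion strategy; the source does not specify one.)
    """
    fused = [[] for _ in blocks[0]]
    for block in blocks:
        for j in range(len(block[0])):
            col = [row[j] for row in block]
            lo, hi = min(col), max(col)
            span = (hi - lo) or 1.0  # a constant column maps to 0.0
            for i, v in enumerate(col):
                fused[i].append((v - lo) / span)
    return fused


def entropy_weighted_kmeans(X, k, gamma=1.0, n_iter=20, seed=0, init=None):
    """Entropy-weighted k-means sketch with per-cluster feature weights."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Initial centers: caller-supplied, or k distinct objects drawn at random.
    centers = [list(c) for c in (init or rng.sample(X, k))]
    weights = [[1.0 / d] * d for _ in range(k)]
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step: nearest center under weighted squared distance.
        for i, x in enumerate(X):
            labels[i] = min(
                range(k),
                key=lambda c: sum(
                    weights[c][j] * (x[j] - centers[c][j]) ** 2
                    for j in range(d)
                ),
            )
        for c in range(k):
            members = [X[i] for i in range(n) if labels[i] == c]
            if not members:
                continue  # keep the previous center and weights
            # Center update: per-feature mean of the cluster members.
            centers[c] = [
                sum(m[j] for m in members) / len(members) for j in range(d)
            ]
            # Weight update: w_cj proportional to exp(-D_cj / gamma), where
            # D_cj is the within-cluster dispersion along feature j.
            disp = [
                sum((m[j] - centers[c][j]) ** 2 for m in members)
                for j in range(d)
            ]
            base = min(disp)  # shift the exponent for numerical stability
            raw = [math.exp(-(D - base) / gamma) for D in disp]
            total = sum(raw)
            weights[c] = [r / total for r in raw]
    return labels, centers, weights
```

Calling `fuse_features(text_block, image_block)` and passing the result to `entropy_weighted_kmeans(fused, k)` returns the cluster labels, the centers, and one weight vector per cluster that sums to 1. A smaller `gamma` concentrates weight on the few features that are most compact within a cluster, which is the behaviour the abstract motivates for high-dimensional sparse data.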
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.09;TP391.41;TP391.1


Article ID: 1778216


Article link: http://sikaile.net/guanlilunwen/ydhl/1778216.html

