Research on CTR Prediction Based on Deep Learning

Published: 2018-09-04 06:27

[Abstract]: With the rapid development of the Internet, cloud computing, the Internet of Things and related technologies, the volume of online data is growing sharply, and the information society has gradually entered the era of "big data". Online advertising delivery systems are built on big data: they use machine learning to analyze and mine massive amounts of user behavior and push suitable advertisements to users in real time. Click-through rate (CTR) prediction is the core technology of such systems and is of great significance for improving their operating efficiency. Accurate CTR prediction is key to making sound e-commerce marketing decisions; it directly affects users' online experience and bears directly on the operating costs of Internet companies. CTR prediction therefore has high commercial value and research value. To meet the high accuracy and timeliness requirements of online advertising delivery systems, this thesis studies feature selection, feature learning, classification prediction and application techniques from the two perspectives of shallow learning and deep learning. Using a real online advertising data set as the experimental object, shallow learning models and deep learning models are built respectively. To validate the deep learning models thoroughly, comprehensive comparative experiments from multiple perspectives are used to confirm the great potential of deep learning. The research work covers the following five aspects:
(1) Data processing and feature engineering. Starting from a real data set, the thesis explores the mechanism by which class imbalance affects prediction models, and studies resampling techniques for imbalanced data.
(2) Given the highly nonlinear characteristics of the data features, a comparative study of shallow learning and deep learning theory and application techniques is carried out. To overcome the limited ability of shallow models to learn complex problems, a deep learning model is built. Experiments with the implemented algorithms confirm that, compared with shallow learning, deep learning improves prediction performance by about 21%, a strong advantage.
(3) To eliminate the influence of class imbalance on prediction models, an improved deep neural network (DNN) model is proposed: SDNN (Deep Neural Network based on Sampling). Using GPU-based parallel computing, the model is built and the algorithm implemented, and it is verified that, without degrading prediction performance, the training time of the SDNN prediction model is reduced by about 73.28%, greatly improving the computational efficiency of the DNN. Given the system's high requirements for accuracy and timeliness, SDNN is shown to be a more efficient prediction method for big data.
(4) The mechanism by which the Sigmoid and ReLU activation functions affect DNN prediction models is studied. By building and implementing DNN and SDNN models with each activation function, it is confirmed that, compared with Sigmoid, ReLU is better suited to deeper network models, and that ReLU-based DNN and SDNN models are better suited to modeling complex problems.
(5) To avoid the limitations of training a single SDNN and to improve the generalization ability of the model, a sensitivity analysis of the key parameter dropout is carried out.
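The abstract does not say which resampling scheme the thesis uses for imbalanced click data. As a minimal sketch, assuming simple random undersampling of the non-click (majority) class, the idea might look like the following. The function name `undersample_majority`, the `ratio` parameter and the synthetic data are all illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def undersample_majority(X, y, ratio=1.0, seed=0):
    """Randomly drop majority-class (label 0) rows.

    Keeps every positive (label 1) row and at most `ratio` negatives
    per positive. Hypothetical helper, not the thesis's SDNN procedure.
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    # Never ask for more negatives than actually exist.
    n_keep = min(len(neg_idx), int(round(ratio * len(pos_idx))))
    kept_neg = rng.choice(neg_idx, size=n_keep, replace=False)
    idx = np.concatenate([pos_idx, kept_neg])
    rng.shuffle(idx)  # avoid all positives appearing first
    return X[idx], y[idx]


# Synthetic, illustrative data only: roughly 5% of rows are clicks.
rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 8))
y = (rng.random(10_000) < 0.05).astype(int)

X_bal, y_bal = undersample_majority(X, y, ratio=1.0)
print("original click rate: ", y.mean())
print("resampled click rate:", y_bal.mean())
```

With `ratio=1.0` the resampled set contains one non-click row per click row. A model trained on such a set sees far fewer rows per epoch, but its predicted probabilities reflect the resampled class balance and would generally need recalibration before being read as real-world click rates.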
[Degree-granting institution]: Chongqing Technology and Business University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP181



Article ID: 2221286


Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2221286.html
