
Research on the Question Recommendation Model of a Machine-Learning-Based Question-Answering Recommendation System

Published: 2018-09-12 06:40
[Abstract]: The question recommendation model described in this thesis belongs to a personalized recommendation system developed for an interactive Chinese question-answering platform. The platform holds a large number of unanswered questions. Using each user's registration information together with their login, browsing, and answering behavior on the platform, the personalized recommendation system recommends relevant questions to the user. This lowers the cost of finding open questions the user is able to answer, increases the number of answered questions, and supports better knowledge sharing.

The recommendation model is a content-based recommendation algorithm built with machine learning techniques. It borrows the approach of precisely targeted advertising systems and takes the click-through rate (CTR) of recommended questions as the optimization objective. Combining Chinese word segmentation [76,77,78,79], keyword extraction, and Named Entity Recognition (NER) [81,82,83,84], it builds a CTR prediction model to match users with questions. The CTR prediction model computes the conditional probability P(click=true | user=uid, question=qid); that is, the probability that a user clicks a new question serves as the measure of how well the user and the question match. A maximum entropy (Max Entropy) model is used to fit this conditional probability.

The original version of the question recommendation model has two shortcomings. First, it uses only a very small number of features, and the low feature dimensionality makes the model prone to underfitting. Second, a static recommendation model cannot adapt to changes in the data distribution.

This thesis improves the original question recommendation model in two respects:

1. Semantic features, combination features, and bias terms are introduced into the question recommendation model and combined with model selection and regularization, which raises the accuracy of the model. The improved model uses probabilistic Latent Semantic Analysis (pLSA) to extract semantic features from question text; processing text at the semantic level gives better results than processing it at the lexical level. On the benchmark dataset, the accuracy of the original recommendation model is 88% and that of the improved model is 95%.

2. An offline training system for the question recommendation model is designed and implemented. The system automatically downloads the base data and performs feature extraction, model training, and model selection, so that the model can be trained offline and updated periodically. The purpose of the offline training system is to produce a new recommendation model at regular intervals. Experimental results show that the data distribution seen by the question recommendation model changes over time, and that a static model cannot adapt to these changes.

The improved question recommendation model and the offline training system are already in production, providing users of the interactive Chinese question-answering system with a more accurate personalized question recommendation service.
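
The CTR model in the abstract can be illustrated with a small sketch. For a binary outcome such as click / no click, a maximum entropy model has the same form as logistic regression, so the sketch below fits P(click=true | user, question) with L2-regularized logistic regression trained by stochastic gradient descent. Everything concrete in it is an assumption made for illustration and does not come from the thesis: the feature names, the `cross_features` helper, the hyperparameters, and the four-row toy click log. It only shows why user-by-question combination features and a bias term help when single features alone cannot separate clicks from non-clicks.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cross_features(user_feats, question_feats):
    """Sparse binary features: bias, single features, and user x question pairs.

    Hypothetical helper: the thesis does not specify its feature encoding.
    """
    feats = {"bias": 1.0}
    for u in user_feats:
        feats["u:" + u] = 1.0
    for q in question_feats:
        feats["q:" + q] = 1.0
    for u in user_feats:
        for q in question_feats:
            feats["u:" + u + "&q:" + q] = 1.0  # combination feature
    return feats


def predict(weights, feats):
    """Estimated P(click=true | user, question) for one feature dict."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return sigmoid(z)


def train(samples, epochs=200, lr=0.1, l2=0.01):
    """SGD on L2-regularized log loss; samples are (feats, clicked) pairs."""
    weights = {}
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for feats, clicked in samples:
            error = predict(weights, feats) - clicked
            for name, value in feats.items():
                w = weights.get(name, 0.0)
                penalty = 0.0 if name == "bias" else l2 * w  # bias is not regularized
                weights[name] = w - lr * (error * value + penalty)
    return weights


# Toy click log (assumed data): users click questions that match their interest.
log = [
    (["likes_python"], ["topic_python"], 1),
    (["likes_python"], ["topic_cooking"], 0),
    (["likes_cooking"], ["topic_cooking"], 1),
    (["likes_cooking"], ["topic_python"], 0),
]
samples = [(cross_features(u, q), y) for u, q, y in log]
weights = train(samples)

p_match = predict(weights, cross_features(["likes_python"], ["topic_python"]))
p_mismatch = predict(weights, cross_features(["likes_python"], ["topic_cooking"]))
print(p_match > p_mismatch)
```

In this toy log every single feature appears equally often with a click and without one, so only the combination features carry signal. That is the underfitting situation the abstract attributes to a model with too few features.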
[Degree-granting institution]: Sun Yat-sen University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP181; TP391.3


Article ID: 2238205



Link to this article: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/2238205.html

