User Preference Analysis Based on Text Classification and Topic Models

Published: 2018-06-16 20:14

  Topics: user preference analysis + text classification; Source: master's thesis, Qingdao University of Science and Technology, 2017


[Abstract]: User preference refers to the rational, inclination-driven choice a user makes after weighing goods or services. The main purpose of analyzing user preferences is to filter out, from a massive amount of information, the information users are interested in, so as to provide them with more personalized services. User preference analysis is therefore the foundation of personalized services. Existing methods, however, still have many problems. On the one hand, most of them analyze users' inherent attributes, which makes it hard to mine finer-grained preferences. On the other hand, when they do analyze fine-grained preferences, they fall short in both accuracy and efficiency.

User preferences can be obtained by mining user behavior: fine-grained classification and clustering of the content a user browses yields the user's fine-grained preferences. First, a tag is a finer-grained representation than a category, and one content item can carry several tags, so tagging content at different levels provides preference features at different levels. Second, clustering according to users' active intent, from the user's point of view and according to users' latent cognition, groups similar content together and provides preference features at the behavioral level.

Based on this analysis, the thesis proposes two text-tagging algorithms and one optimized hierarchical clustering algorithm for undirected graphs.

First, it proposes a weighted supervised LDA algorithm (WLLDA). The algorithm uses the chi-square test to reduce the dimensionality of text features. A new weighted bag-of-words model raises the weight of words in the original bag that are meaningful for topic classification, which increases the divergence between topics and improves classification accuracy. A multi-model ensemble approach samples and trains on topics of different frequencies, which resolves the mutual interference that an unbalanced corpus causes in a single model. A new topic-closeness measure starts from the original topic probability and additionally considers three factors (keyword hit rate, keyword frequency, and tag support) in order to improve the accuracy of topic prediction.

Second, it proposes a tagging algorithm based on word2vec. The algorithm uses CRF to extract keywords from the text, then uses the word vectors produced by word2vec together with LR to cluster the keywords and build the tag set, which avoids the incomplete coverage of manually curated tag libraries. Finally, it denoises the text to extract its backbone and assigns tags by comparing the similarity between the word vectors of the backbone words and the tag word vectors.

Third, it proposes a parallelized, optimized hierarchical clustering algorithm for undirected graphs, which abstracts users' active search-intent behavior as an undirected graph. Splitting nodes that have many edges reduces the negative effect of the decay factor on those nodes and allows the graph clustering to be computed in parallel, which substantially improves both accuracy and computational efficiency.

With these three algorithms, the thesis converts a user's degree of preference for content into a preference for tags and finally characterizes the user's fine-grained preference features, achieving the goal of user preference analysis.
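The abstract names chi-square feature reduction but gives no procedure, so the following is only a minimal sketch of the standard chi-square term score used in text categorization, not the thesis's implementation. The function names `chi_square_scores` and `select_top_terms`, the one-label-versus-rest setup, and the tokenized input format are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by its chi-square statistic against one label.

    docs   : list of token lists (hypothetical input format)
    labels : list of label strings, one per document
    target : the label treated as the positive class
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    # Document frequencies of each term inside and outside the target class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        bucket = df_pos if y == target else df_neg
        bucket.update(set(tokens))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # in-class documents containing the term
        b = df_neg[term]   # out-of-class documents containing the term
        c = n_pos - a      # in-class documents without the term
        d = n_neg - b      # out-of-class documents without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_terms(docs, labels, target, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels, target)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Note that the statistic measures the strength of association, not its direction, so a term that strongly signals the absence of a label scores as high as one that signals its presence. Whether the thesis filters on direction is not stated.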
[Degree-granting institution]: Qingdao University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1


Article ID: 2027968


Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2027968.html

