Research on Personalized Recommendation Systems Based on Spectral Clustering
Published: 2019-05-10 14:25
[Abstract]: With the rapid development of Web 2.0 and e-commerce, information resources are growing exponentially. Recommender systems are currently one of the effective ways to address information overload, and collaborative filtering is the most widely used algorithm in such systems, yet it still suffers from data sparsity, limited scalability, and the cold-start problem. In addition, most personalized recommendation systems ignore attributes of the users themselves, such as age, gender, and occupation; when user-item rating data are hard to obtain, this seriously degrades recommendation accuracy. After analyzing and comparing the commonly used personalized recommendation algorithms and related techniques, this thesis takes data sparsity and cold start as its starting point and studies personalized recommendation based on spectral clustering, with the aim of improving recommendation accuracy and reducing the time complexity of the recommendation algorithm. The main contributions are as follows.
(1) Spectral clustering is introduced into personalized recommendation. It is improved with weighted kernel fuzzy clustering and an initial-centroid selection algorithm, and the Pearson correlation is modified. Combining the improved spectral clustering and similarity measure with collaborative filtering yields two improved collaborative filtering algorithms based on spectral clustering of users. On the MovieLens 100K dataset, the mean absolute error (Mean Absolute Error, MAE) and root mean square error (Root Mean Square Error, RMSE) of both algorithms are at least 4% lower than those of traditional K-means clustering collaborative filtering, and running time is reduced by at least half. On the MovieLens 1M dataset, MAE and RMSE improve by at least 2%, and running time is reduced by 80%.
(2) Preprocessing methods for user age, gender, and occupation are proposed. From the resulting user attribute matrix, a collaborative filtering algorithm based on spectral clustering of user attributes is proposed.
(3) To address the over-fitting of the biased singular value decomposition (Bias Singular Value Decomposition, BSVD) algorithm, user attributes and historical user-item ratings are used together: the attribute-based spectral clustering above is combined with the BSVD model, and a new-user check is added to the model to handle cold start, giving an improved recommendation algorithm. Compared with BSVD, this algorithm reduces MAE and RMSE by at least 6% on MovieLens 100K and by at least 2% on MovieLens 1M. The experiments indicate that the algorithm not only improves recommendation accuracy but also has a degree of scalability.
(4) Several experiments are designed on existing datasets to compare the proposed algorithms with traditional ones. The results indicate that applying spectral clustering to personalized recommendation can considerably improve prediction accuracy and the real-time response speed of the system, and ultimately bring greater economic returns to enterprises and merchants.
[Degree-granting institution]: Fujian Agriculture and Forestry University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3
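The abstract reports its results as percentage reductions in MAE and RMSE. As a minimal sketch of how such figures are computed, the Python below implements the two standard metric definitions and the relative reduction between a baseline and an improved predictor. The rating lists and the function name `relative_reduction` are illustrative assumptions; they are not data or code from the thesis.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between true and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between true and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def relative_reduction(baseline, improved):
    """Fractional drop of an error metric relative to a baseline value."""
    return (baseline - improved) / baseline


# Hypothetical held-out ratings and two sets of predictions (made-up numbers).
actual = [4, 3, 5, 2]
baseline_pred = [3, 3, 4, 4]
improved_pred = [4, 3, 4, 3]

baseline_mae = mae(actual, baseline_pred)
improved_mae = mae(actual, improved_pred)
print("MAE reduction:", relative_reduction(baseline_mae, improved_mae))
print("RMSE baseline:", rmse(actual, baseline_pred))
print("RMSE improved:", rmse(actual, improved_pred))
```

Because RMSE squares each error before averaging, it penalizes large individual mistakes more heavily than MAE, which is one reason the two metrics are usually reported side by side.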
Article ID: 2473717
Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/2473717.html