Research on a Push Algorithm Based on Personalized Prediction
[Abstract]: Screening reliable, relevant information from massive volumes of data efficiently and accurately is a key research problem in the information service industry. Search-engine-based pull services and information push services are the two main channels for obtaining information. Rural areas in China are less economically developed and farmers generally have limited formal education, so relying on search engines to obtain information is unrealistic there; information push services are a better fit. "Personalization" is the basic starting point of a push model: by selecting the K nearest-neighbor samples and constructing a prediction model, specific information is pushed to the target user. The key difficulties are selecting the K nearest neighbors, measuring the similarity between samples, and determining the value of K. This study improves on both neighbor selection and training-sample selection, with the following results. To construct the push model, a nearest-neighbor set must first be selected for the target user, consisting of the K user samples with the highest similarity. Commonly used similarity measures include the Pearson correlation coefficient, mean squared differences (MSD), and Euclidean distance, but these measures cannot capture complex nonlinear relationships between two users, which makes the nearest-neighbor set inaccurate. This paper introduces the maximal information coefficient (MIC) as the similarity measure between users. Compared with traditional mutual information, MIC groups the values of each variable into superclusters and obtains the optimal partition points of each variable by stepwise optimization, so that the mutual information of the two variables is maximized; it is suitable for nonlinear functions of any form, and even for superpositions of functions.
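The nearest-neighbor selection step described above can be sketched as follows. This is a minimal illustration, not the author's implementation: it uses the Pearson correlation coefficient (one of the baseline measures the abstract names) over co-rated items, and the function names, the dictionary layout of `ratings`, and the toy data are all assumptions made for the example.

```python
import math

def co_rated(u, v):
    """Items rated by both users, in a stable order."""
    return sorted(set(u) & set(v))

def pearson_similarity(u, v):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    items = co_rated(u, v)
    if len(items) < 2:
        return 0.0
    mu = sum(u[i] for i in items) / len(items)
    mv = sum(v[i] for i in items) / len(items)
    cov = sum((u[i] - mu) * (v[i] - mv) for i in items)
    su = math.sqrt(sum((u[i] - mu) ** 2 for i in items))
    sv = math.sqrt(sum((v[i] - mv) ** 2 for i in items))
    if su == 0 or sv == 0:
        return 0.0  # one user rated every shared item identically
    return cov / (su * sv)

def k_nearest_neighbors(target, ratings, k, similarity=pearson_similarity):
    """Return the k users most similar to `target` as (user, score) pairs."""
    scores = [
        (other, similarity(ratings[target], ratings[other]))
        for other in ratings
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Toy ratings (hypothetical data, not taken from MovieLens).
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 3},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
    "dave":  {"m1": 4, "m2": 4},
}
neighbors = k_nearest_neighbors("alice", ratings, k=2)
```

Because Pearson only measures linear association, two users whose ratings are related nonlinearly can score near zero here, which is the weakness the abstract points to. A MIC-based function could be passed as `similarity` in the same way, although this sketch does not implement MIC.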
MIC can effectively reflect complex nonlinear relationships between two users, making the nearest-neighbor set more accurate and improving the prediction accuracy of the push model. The other key component of the push model is predicting the target user's unrated items from the nearest-neighbor set. The predicted rating of an item directly determines whether the item is pushed to the target user, so an incorrect prediction causes the wrong information to be pushed. Constructing a high-precision rating prediction model and selecting suitable training samples are therefore crucial. The nearest-neighbor set is computed from similarity over all rated items, but when predicting a particular item for a particular user, differences in time, region, culture, and so on mean that using all nearest-neighbor samples as training samples does not always give the best prediction. Selecting the k optimal samples from the full nearest-neighbor set is the core of the k-nearest-neighbor selection problem. This study introduces geostatistics to analyze the structure of the nearest-neighbor set of each item to be predicted, obtains a common range a, and selects for each user the k training samples whose distance is less than a, thereby realizing personalized prediction for each user. With these improvements to neighbor selection and training-sample selection, a rating prediction model based on the support vector machine is built using the MovieLens rating data set as an example, which greatly improves the prediction accuracy of item ratings.
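The training-sample selection rule (keep at most k neighbors whose distance is below the range a) can be sketched as below. The abstract does not say how the distance is defined or how a is derived from the geostatistical analysis, so this example assumes both are already available: `distances` is a hypothetical mapping from neighbor to a precomputed distance, and `a` is passed in as a plain parameter.

```python
def select_training_samples(distances, a, k):
    """Keep neighbours with distance below the range `a`, nearest first, at most k."""
    within = [(user, d) for user, d in distances.items() if d < a]
    within.sort(key=lambda pair: pair[1])
    return [user for user, _ in within[:k]]

# Hypothetical distances from the target user to each nearest neighbour.
distances = {"bob": 0.10, "dave": 0.35, "erin": 0.80, "frank": 0.20}
selected = select_training_samples(distances, a=0.5, k=2)
```

With these toy values, `erin` is excluded because her distance exceeds a, and the two closest remaining neighbors are kept. The selected users would then supply the training samples for the rating prediction model.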
[Degree-granting institution]: Hunan Agricultural University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.3