Design and Implementation of an Exercise Recommendation System Based on Collaborative Filtering
Topic: collaborative filtering + Hadoop; Source: Yangtze University, master's thesis, 2017
[Abstract]: In recent years, with the rapid development of information technologies such as the Internet, mobile devices, and sensors, data volumes in every industry have been growing at the TB and even PB scale. Recommendation systems help users find the information that interests or matters to them among large amounts of data, which makes them more efficient and gives them a clear direction. In the e-commerce era, recommendation systems are widely used on shopping, news, food, reading, and music websites, where they make daily life more convenient and improve the user experience. In education, the traditional digital campus model no longer meets current needs, and the rise of the Internet of Things, big data, and cloud computing has brought new ideas to educational informatization. The "smart campus" concept proposed in recent years combines education with advanced technology and brings clear improvements in safety, teaching, learning, and campus life. Online examination systems benefit both teachers and students: they reduce teachers' workload and let students answer questions online. Personalized recommendation in these systems, however, still needs to be strengthened. Information technology has removed the limits of time and place on teaching, and applying recommendation systems to education is a worthwhile attempt.

Recommendation systems address the problem of how users can quickly find useful information in massive data, and they give consumers and other users a friendlier interface and a better experience. They also have shortcomings: recommendations can be inaccurate or simply wrong. Research on recommendation systems concentrates on five areas: acquiring user information, processing it, recommendation algorithms, the effectiveness of recommendations, and their impact. Among recommendation algorithms, collaborative filtering (CF) is one of the most commonly used, but the cold-start and data-sparsity problems remain open difficulties for it. The popular recommendation algorithms today are collaborative filtering, content-based recommendation, and hybrid recommendation, each with its own strengths and weaknesses. The literature shows that the most common way to optimize a recommendation algorithm is hybrid recommendation, which combines different algorithms so that each compensates for the weaknesses of the others.

Building on earlier work, this thesis studies two questions: how to make non-personalized recommendations of suitable exercises to new users, who have no historical data when they first log in, and how to make personalized recommendations of suitable exercises to existing users. It analyzes the main factors that affect users and exercises, processes these factors appropriately, and optimizes the collaborative filtering algorithm so that relatively effective exercises are recommended to users.

First, data is collected from users' history records and then digitized, classified, normalized, and denoised. The factors that influence exercise recommendation are identified and combined through a weight formula, and similarity is computed with the Pearson correlation coefficient. Next, to address the cold-start problem of collaborative filtering, in which a new user logging in for the first time has no historical data on which to base a personalized recommendation, users are grouped by student ID to determine their class and major. The historical data of that class and major is analyzed to produce rankings of frequently missed exercises, frequently missed knowledge points, and common exam exercises, which are used for non-personalized recommendation. Then, to address the data-sparsity problem, the collaborative filtering method is optimized by merging the results of user-based recommendation and content-based recommendation: exercises that appear in both results are recommended first, and the remaining exercises are sorted by similarity and recommended in order. Because this method combines the results of two recommendation algorithms, it improves accuracy, and its advantage is more pronounced when data is sparse.

The final recommendation results produced by the optimized collaborative filtering algorithm are saved to the HDFS file system. To display the results, Sqoop is used to connect the traditional MySQL database with the HDFS distributed file system so that data can be transferred between them automatically. Finally, the results are read from MySQL and presented through a web interface, forming the complete exercise recommendation system.

Applying a recommendation system to exercises has an incentive effect in education and makes teaching more flexible. For teachers, it greatly reduces workload. For students, it saves the time spent selecting suitable exercises from a mass of information, improves learning efficiency and enthusiasm, and accommodates differences between students. Combining new technology with education reflects the fairness, efficiency, and diversity of education and has practical significance.
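The similarity and merging steps described in the abstract can be sketched in Python. This is a minimal illustration, not the thesis's implementation: the abstract does not give the weight formula, so the sketch assumes each user is already represented as a mapping from exercise ID to a single numeric score. The function names `pearson` and `merge_recommendations`, the rule for ordering exercises within the shared group, and the tie-breaking by exercise ID are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the exercises that both users have scores for.

    `a` and `b` map exercise ID -> numeric score (an assumed representation).
    Returns 0.0 when there are fewer than two shared exercises or no variance.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    xs = [a[k] for k in common]
    ys = [b[k] for k in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def merge_recommendations(user_based, content_based):
    """Merge two candidate lists, each mapping exercise ID -> similarity.

    Exercises present in both lists come first; the rest follow in
    descending order of similarity. Ordering inside each group by the
    higher of the two similarities is an assumption of this sketch.
    """
    def best(exercise):
        return max(
            user_based.get(exercise, float("-inf")),
            content_based.get(exercise, float("-inf")),
        )

    both = set(user_based) & set(content_based)
    rest = (set(user_based) | set(content_based)) - both
    shared_sorted = sorted(both, key=lambda e: (-best(e), e))
    rest_sorted = sorted(rest, key=lambda e: (-best(e), e))
    return shared_sorted + rest_sorted
```

For example, merging `{"e1": 0.9, "e2": 0.4, "e3": 0.7}` with `{"e2": 0.6, "e4": 0.8}` places `e2` first because both methods proposed it, even though its similarity is lower than that of `e1`.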
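The cold-start step, in which a new user receives a class-level ranking of frequently missed exercises, can also be sketched. The abstract does not say how class and major are encoded in the student ID, so this example assumes, purely for illustration, that the first `prefix_len` characters of the ID identify the class. The record layout `(student_id, exercise_id, correct)` and the names `class_of` and `most_missed` are likewise hypothetical.

```python
from collections import defaultdict


def class_of(student_id, prefix_len=6):
    """Return the class key for a student ID.

    Assumption: the leading `prefix_len` characters identify the class.
    """
    return student_id[:prefix_len]


def most_missed(records, student_id, top_n=3, prefix_len=6):
    """Rank exercises by error rate within the new student's class.

    `records` is an iterable of (student_id, exercise_id, correct) tuples,
    a hypothetical layout for the answer history. Ties are broken by
    exercise ID so the result is deterministic.
    """
    group = class_of(student_id, prefix_len)
    attempts = defaultdict(int)
    wrong = defaultdict(int)
    for sid, exercise, correct in records:
        if class_of(sid, prefix_len) != group:
            continue  # ignore answers from other classes
        attempts[exercise] += 1
        if not correct:
            wrong[exercise] += 1
    ranked = sorted(
        attempts,
        key=lambda ex: (-wrong[ex] / attempts[ex], ex),
    )
    return ranked[:top_n]
```

Because the ranking depends only on the class's history, it is available to a student with no answers of their own. The same pattern would apply to the knowledge-point and common-exam-exercise rankings by changing what is counted.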
[Degree-granting institution]: Yangtze University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3
Article ID: 2055395
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2055395.html