Design and Implementation of a Product Ranking Module for a Commodity Search Engine
Published: 2018-10-13 07:43
[Abstract]: With the growth of the Internet and e-commerce, C2C and B2C websites offer users an ever larger number and variety of commodities, and each shopping site provides a search service for the commodities it hosts. Users who want to compare commodities across sites need a web-wide commodity search engine: a vertical search engine that indexes the commodity information of all merchants and supports retrieval along multiple dimensions. If such an engine presents query results at the commodity level, users are overwhelmed by the volume of data. Several methods have been proposed to improve how results are displayed; one of the better ones is to reduce the returned results from the commodity dimension up to the product dimension. A product is a generalization of a commodity: a user can first locate a product and then compare and choose among the commodities under it. The authority of product-node ranking therefore reflects, to some extent, the authority of the commodity search engine's ranking. Against this background, and drawing on the author's work at an internship company, this thesis designs and implements a product ranking module for a commodity search engine. The static product scores the module produces are already used online as the basis for product ranking.
The product ranking module consists of two sub-modules: an offline static-score computation module and a data monitoring module. The offline computation uses Hadoop so that it can process massive volumes of commodity data. The design is also extensible: in response to operational requirements, the scoring criteria for products in different categories can be changed, and product nodes with abnormal scores can be given special handling.
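As an illustration of reducing commodity-level records to product-level static scores, the following is a minimal sketch in plain Python that imitates the map, shuffle, and reduce phases in memory. It is not the thesis's Hadoop program: the record fields (`product_id`, `category`, `sales`, `reviews`), the `CATEGORY_WEIGHTS` table, and the weighted-sum formula are all assumptions made for the example, since the abstract does not state the actual features or scoring criteria.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical per-category weights (assumption; the real criteria are not given).
# Keeping them in a table mirrors the idea that operations staff can change the
# scoring standard for individual categories.
CATEGORY_WEIGHTS: Dict[str, Dict[str, float]] = {
    "phone": {"sales": 0.75, "reviews": 0.25},
    "default": {"sales": 0.5, "reviews": 0.5},
}

Value = Tuple[str, int, int]  # (category, sales, reviews)


def map_commodity(record: dict) -> Iterator[Tuple[str, Value]]:
    """Map step: key each commodity listing by the product it belongs to."""
    yield record["product_id"], (record["category"], record["sales"], record["reviews"])


def reduce_product(product_id: str, values: List[Value]) -> Tuple[str, float]:
    """Reduce step: fold all listings of one product into one static score."""
    category = values[0][0]
    weights = CATEGORY_WEIGHTS.get(category, CATEGORY_WEIGHTS["default"])
    total_sales = sum(v[1] for v in values)
    total_reviews = sum(v[2] for v in values)
    score = weights["sales"] * total_sales + weights["reviews"] * total_reviews
    return product_id, score


def run_job(records: Iterable[dict]) -> Dict[str, float]:
    """Simulate the shuffle: group mapper output by key, then reduce each group."""
    groups: Dict[str, List[Value]] = defaultdict(list)
    for record in records:
        for key, value in map_commodity(record):
            groups[key].append(value)
    return dict(reduce_product(key, values) for key, values in groups.items())


if __name__ == "__main__":
    listings = [
        {"product_id": "p1", "category": "phone", "sales": 10, "reviews": 4},
        {"product_id": "p1", "category": "phone", "sales": 20, "reviews": 6},
        {"product_id": "p2", "category": "book", "sales": 8, "reviews": 2},
    ]
    print(run_job(listings))
```

In a real Hadoop job the grouping done by `run_job` would be performed by the framework's shuffle phase; only the map and reduce functions would be supplied by the program.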
The data monitoring module lets developers monitor fluctuations in product scores and the state of product features; the reports it generates make it possible to observe and trace the causes of abnormal ranking scores directly. The module is built on the Django framework and follows Django's MTV pattern, dividing the system from top to bottom into a template layer, a view layer, and a model layer. Separating the template layer from the logic-processing (view) layer and the model layer makes it easier for developers to build data-driven web applications.
The thesis first introduces the project background and then briefly describes the technologies and frameworks used. It next analyzes the requirements of the product ranking module and, based on that analysis, details the design and implementation of the static-score generation module and the data monitoring module; the core of the work is the design and implementation of the MapReduce program that computes the static scores. It closes with a summary of the project and an outlook on future work.
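The kind of score-fluctuation check a monitoring report could be built on might look like the sketch below. It is plain Python rather than Django code, and everything in it is an assumption for illustration: the function name `flag_fluctuations`, the two-snapshot comparison, and the 50% relative-change threshold do not come from the thesis.

```python
from typing import Dict, List, Tuple

# One report row: (product_id, old_score, new_score, relative_change)
Row = Tuple[str, float, float, float]


def flag_fluctuations(
    previous: Dict[str, float],
    current: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cutoff: flag changes larger than 50%
) -> List[Row]:
    """Compare two score snapshots and list products whose score moved sharply."""
    rows: List[Row] = []
    for product_id in sorted(current):
        new = current[product_id]
        old = previous.get(product_id)
        if old is None or old == 0:
            # New products and zero baselines have no meaningful relative
            # change; this sketch simply skips them.
            continue
        change = (new - old) / old
        if abs(change) > threshold:
            rows.append((product_id, old, new, change))
    return rows


if __name__ == "__main__":
    yesterday = {"p1": 10.0, "p2": 20.0}
    today = {"p1": 25.0, "p2": 21.0, "p3": 5.0}
    for row in flag_fluctuations(yesterday, today):
        print(row)
```

In an MTV layout, a function like this would sit behind a view, with the snapshots read through the model layer and the rows rendered by a template.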
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP311.52
Article ID: 2267817
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2267817.html