Research on Focused Search Engines and Their Application in Community Informatization

Published: 2018-11-01 20:02
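[Abstract]: "Cloud computing", as a brand-new business model, was proposed by Google in 2006, and it offered industry and academia a new line of thinking. The team of Professor Yuan Dongfeng at the School of Information Science and Engineering, Shandong University, quickly seized this opportunity, carried out in-depth research on new informatization models based on cloud computing, and achieved interim results. The team has been supported by two Shandong Province major special projects for the transformation of independent innovation achievements. The topic of this thesis comes from the second of these, "Low-cost, low-energy, high-reliability embedded terminal and information service platform" (2010ZHZX1A1001).

Under the national trend toward urbanization, the conversion of rural villages into communities with large-scale operation and a collective economy has begun. Rural conversion in Shandong Province has progressed quickly, and the pilot area chosen by the major special project behind this thesis is a typical village converted into a community. Community informatization has also become a very important part of informatization as a whole: the National Informatization Development Strategy 2006-2020 lists the promotion of community informatization as one of the strategic priorities of China's informatization development. Against this background, the project team studied key informatization technologies and proposed "cloud computing server + broadband network + thin client", a new informatization model that abandons the PC entirely. The team developed and mass-produced thin clients based on an embedded architecture, bringing both cost and power consumption down to a very low level. It also developed a cloud computing server cluster and, based on a survey of community users, built the applications and information services those users care about. This model replaced the traditional PC-centred approach to informatization in a large-scale pilot demonstration, with good results.

Based on the requirements of the target users and the characteristics of the new community informatization model, this thesis designs and implements a focused search engine for Taobao shopping, giving community users convenient and fast shopping search and recommendation. Because Taobao carries a very wide variety of goods, a general commodity model is designed and implemented so that adding new goods does not require large-scale changes to the data tables. The system consists of two main modules, a web crawler and an information search module. The web crawler module fetches Taobao commodity information, builds the index files, and stores detailed commodity information in the database. The information search module provides the user keyword query interface, index file queries, and database queries, and presents the search result list, detailed information, and recommendations to the user.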
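The abstract does not say how the general commodity model is structured. One common way to get the stated property (new kinds of goods without table changes) is an entity-attribute-value layout, sketched below. This is an assumption for illustration, not the thesis's actual schema: the table names `item` and `item_attr`, the helper functions, and the sample goods are all hypothetical, and Python with SQLite is used only for brevity (the thesis system was written in Java).

```python
import sqlite3


def create_schema(conn):
    """Create two fixed tables; category-specific fields go into item_attr rows."""
    conn.executescript(
        """
        CREATE TABLE item (
            id       INTEGER PRIMARY KEY,
            category TEXT NOT NULL,
            title    TEXT NOT NULL,
            price    REAL NOT NULL
        );
        CREATE TABLE item_attr (
            item_id INTEGER NOT NULL REFERENCES item(id),
            name    TEXT NOT NULL,
            value   TEXT NOT NULL,
            PRIMARY KEY (item_id, name)
        );
        """
    )


def add_item(conn, category, title, price, attrs):
    """Insert one commodity plus its free-form attributes; returns the new id."""
    cur = conn.execute(
        "INSERT INTO item (category, title, price) VALUES (?, ?, ?)",
        (category, title, price),
    )
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO item_attr (item_id, name, value) VALUES (?, ?, ?)",
        [(item_id, name, str(value)) for name, value in attrs.items()],
    )
    return item_id


def load_item(conn, item_id):
    """Return the commodity as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT category, title, price FROM item WHERE id = ?", (item_id,)
    ).fetchone()
    if row is None:
        return None
    attrs = dict(
        conn.execute(
            "SELECT name, value FROM item_attr WHERE item_id = ?", (item_id,)
        ).fetchall()
    )
    return {"category": row[0], "title": row[1], "price": row[2], "attrs": attrs}


# Hypothetical sample data: two unrelated categories share the same two tables.
conn = sqlite3.connect(":memory:")
create_schema(conn)
phone_id = add_item(conn, "phone", "Example phone", 999.0,
                    {"screen": "4.0 inch", "os": "Android"})
shoe_id = add_item(conn, "shoes", "Example shoes", 120.0,
                   {"size": 42, "color": "black"})
```

In this layout a new category with new attribute names only adds rows to `item_attr`. The trade-off is that attribute values are stored as text, so typed queries on them need casting.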
In the crawler module, to cope with the efficiency of fetching massive amounts of data, a distributed web crawler based on Hadoop is implemented in Java. The thesis first sets up a distributed Hadoop environment on Ubuntu 9.10 and then designs a distributed crawler program for Hadoop that fetches data from Taobao. A data storage strategy is designed to build the index files; the caching strategy is optimized to reduce physical space usage; an information extraction method tailored to the characteristics of Taobao data is designed, and detailed commodity information is stored in the database; log storage rules are designed to handle runtime exceptions that network conditions may cause; and a user interface is provided for configuring the crawling rules.

In the search module, browser-based information search is implemented. The core of the search program is a J2EE project that performs index file queries and database queries. The system first implements a runtime environment configuration function for setting system parameters. A front-end page provides the user query interface; the keyword is looked up in the index files to obtain the set of commodities matching it, and the database entry information in that set is then used in a database query to obtain the result set. Because the target users are price-sensitive, the result set can be sorted by price. Detailed commodity queries are also implemented, showing the commodity's price, title, description, and price curve, together with recommendations of commodities in a similar price range.
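The search flow described above (keyword, index lookup, record lookup, price sort, similar-price recommendation) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the thesis's J2EE implementation: the in-memory inverted index and the `ITEMS` dictionary stand in for the real index files and database, the sample data is invented, and the `tolerance` parameter that defines a "similar price range" is hypothetical.

```python
from collections import defaultdict

# Hypothetical commodity records standing in for the database.
ITEMS = {
    1: {"title": "cotton t-shirt", "price": 39.0},
    2: {"title": "cotton socks", "price": 9.9},
    3: {"title": "wool sweater", "price": 129.0},
    4: {"title": "cotton shirt", "price": 45.0},
}


def build_index(items):
    """Map each lowercase title token to the set of item ids containing it."""
    index = defaultdict(set)
    for item_id, item in items.items():
        for token in item["title"].lower().split():
            index[token].add(item_id)
    return index


def search(index, items, keyword):
    """Look the keyword up in the index, then return matching ids by ascending price."""
    ids = index.get(keyword.lower(), set())
    return sorted(ids, key=lambda i: (items[i]["price"], i))


def recommend(items, item_id, tolerance=0.25):
    """Return other items priced within +/- tolerance of the given item, nearest first."""
    base = items[item_id]["price"]
    low, high = base * (1 - tolerance), base * (1 + tolerance)
    near = [
        i for i, item in items.items()
        if i != item_id and low <= item["price"] <= high
    ]
    return sorted(near, key=lambda i: (abs(items[i]["price"] - base), i))


index = build_index(ITEMS)
cheapest_first = search(index, ITEMS, "cotton")
similar_to_tshirt = recommend(ITEMS, 1)
```

Keeping only ids in the index and fetching full records afterwards mirrors the split the abstract describes between index file queries and database queries.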
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP391.3


Article ID: 2304952


Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2304952.html

