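Research on Web-Based Data Mining Search Engine Technology for Carbon Industry Information
Published: 2018-05-02 21:16
Topic: Web + data mining; Source: University of Electronic Science and Technology of China, master's thesis, 2013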
[Abstract]: Search engines have changed the way people navigate the Web: users can quickly locate the resources they want. As the Web has grown, however, the number of pages has increased at a striking rate and the quality of the information they contain has become uneven, which makes searching and judging results considerably harder. Large general-purpose search engines such as Baidu and Google process the results they return in various ways in an effort to serve every kind of user, but satisfying the search needs of all groups remains quite difficult. Searches for information about one particular industry, in particular, often return unsatisfactory results. Searching for carbon industry information therefore calls for a dedicated search engine that improves the search experience of users in that industry. Moreover, because individual users' interests differ, even users searching for carbon industry information focus on different aspects, so a personalized search method is needed to optimize the search engine. The thesis first studies data mining technology, covering the meaning and functions of data mining, Web content mining, Web structure mining, and Web usage mining; in line with the personalization requirements that the search engine should meet, it combines these three Web data mining methods to improve the user's search experience. It then studies an access-interest model for carbon industry users, covering the choice of how to obtain user access information, data preparation and the identification of visiting users, and the construction of a user access-interest model through concept extraction and concept association on the retrieved page content. Next it studies search engine technology: after preprocessing that consists of page collection, word segmentation, elimination of duplicate pages, and evaluation of page importance, the engine provides query services to search users. Finally, a simple carbon industry information search engine is designed, which to a certain extent realizes the mining of carbon industry information and personalized service.
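The preprocessing steps named in the abstract (eliminating duplicate pages and evaluating page importance) can be illustrated with a minimal Python sketch. The abstract does not say which algorithms the thesis uses, so everything here is an assumption: word-shingle Jaccard similarity stands in for duplicate detection, a basic PageRank power iteration stands in for importance scoring, and the function names, the 0.8 threshold, and the 0.85 damping factor are hypothetical choices made for illustration only.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a page's text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(pages, threshold=0.8):
    """Keep the first page of every near-duplicate group.

    pages: dict mapping URL -> page text (assumed input format).
    Returns the URLs that survive, in insertion order.
    """
    kept = {}
    for url, text in pages.items():
        sig = shingles(text)
        if all(jaccard(sig, other) < threshold for other in kept.values()):
            kept[url] = sig
    return list(kept)


def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure with a plain power iteration.

    links: dict mapping URL -> list of URLs it links to (assumed format).
    Pages without outgoing links spread their rank evenly over all pages.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                share = damping * rank[u] / n
                for v in nodes:
                    new[v] += share
        rank = new
    return rank


if __name__ == "__main__":
    pages = {
        "a": "graphite electrode price rises in march",
        "b": "graphite electrode price rises in march",
        "c": "carbon fiber market report for 2013",
    }
    print(deduplicate(pages))
    print(pagerank({"a": ["c"], "b": ["c"], "c": ["a"]}))
```

The pairwise comparison in `deduplicate` is quadratic in the number of pages, which is acceptable for a small industry-specific crawl; a larger collection would more likely need a hashing scheme, which this sketch does not attempt.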
[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3; TP311.13
Article ID: 1835531
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1835531.html