Research on Identifying User Information Needs Based on Web Log Mining

Published: 2019-05-08 04:40

[Abstract]: Information overload and disorientation are among the conditions every information user faces today. On the Internet, users hope that a search engine will surface the information they actually need from a vast pool of content. Because of factors such as a user's knowledge, background, and environment, however, the query terms submitted to a search engine often fail to express the underlying information need accurately. Some scholars have studied the patterns of users' search-engine information behavior on their own, hoping to infer user interests from that behavior; others have collected users' information needs through online questionnaires. This thesis differs in that it combines the characteristics of users' information behavior with data mining techniques to build a model that identifies user information needs, so that those needs can be obtained automatically, with the expectation that the model can be used to improve search engine efficiency. The focus is on mining users' query logs through their information behavior characteristics and building an automatic classification model of user information needs. The thesis first reviews and analyzes the theory of Web log mining and of user information needs, lays out the theoretical basis of the study, and states the research questions. It then describes the data preprocessing stage of log mining in detail, covering the data source, the data format, and preprocessing operations such as log cleaning and transformation and user identification. Next, it classifies users' information search behavior, concentrating on latent search behavior, and uses simple statistical methods to summarize basic behavioral characteristics and patterns of search engine users. Finally, it divides search-engine-based user information needs into navigational needs and informational-transactional needs, and uses users' information behavior characteristics to build an automatic classification model of those needs.
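
The abstract names the pipeline stages (log cleaning, user identification, behavior statistics, need classification) without giving details. The following Python sketch illustrates one way such a pipeline could look; it is not the thesis's model. The tab-separated record layout (user ID, query, clicked URL), the click-concentration feature, the 0.8 threshold, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def parse_log(lines):
    """Cleaning step: keep only well-formed rows with non-empty fields.

    Assumed (hypothetical) layout per line: user_id <TAB> query <TAB> clicked_url
    """
    records = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # malformed row
        user_id, query, url = (p.strip() for p in parts)
        if not user_id or not query or not url:
            continue  # missing field
        records.append((user_id, query.lower(), url))
    return records

def queries_by_user(records):
    """User identification step: group the queries issued by each user ID."""
    users = defaultdict(list)
    for user_id, query, _url in records:
        users[user_id].append(query)
    return users

def clicks_by_query(records):
    """Behavior statistics: count clicks per URL for every query."""
    clicks = defaultdict(Counter)
    for _user_id, query, url in records:
        clicks[query][url] += 1
    return clicks

def classify_query(url_counts, threshold=0.8):
    """Assumed heuristic: clicks concentrated on one URL suggest a navigational need."""
    total = sum(url_counts.values())
    top = max(url_counts.values())
    if top / total >= threshold:
        return "navigational"
    return "informational-transactional"

# Toy log with two invalid rows that the cleaning step should drop.
RAW_LOG = [
    "u1\texample mail\thttp://mail.example.com/\n",
    "u2\tExample Mail\thttp://mail.example.com/\n",
    "u3\texample mail\thttp://mail.example.com/\n",
    "u1\tweb log mining\thttp://a.example.org/intro\n",
    "u2\tweb log mining\thttp://b.example.org/survey\n",
    "u3\tweb log mining\thttp://c.example.org/tools\n",
    "broken line\n",
    "u4\t\thttp://x.example.org/\n",
]

records = parse_log(RAW_LOG)
users = queries_by_user(records)
labels = {q: classify_query(c) for q, c in clicks_by_query(records).items()}
print(labels)
```

In this toy log, every click for "example mail" lands on the same URL, so the heuristic labels it navigational, while the clicks for "web log mining" are spread over three URLs. A real study would replace the single threshold with a model trained on several behavior features.
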
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: G350



Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2471626.html
