Design of a Product Search Engine Based on Mining User-Generated Content
Published: 2018-12-20 10:40
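[Abstract]: With the spread of e-commerce platforms, the wealth of information online makes purchasing more convenient for consumers, but it also increases their cognitive load. Customers typically choose a product based solely on their own experience, the product pictures, and the basic product information. A growing number of people freely express their views online, and this large body of user-generated content can help users understand products more fully and make rational decisions. Reading every available review of every product in the search results, however, is a time-consuming and arduous task. Customers usually read product reviews for one of two reasons: either to find a product whose features or related services are rated best, or to find reviews that comment on a specific product feature, such as the battery life of a mobile phone. The main purpose of this study is therefore to automate this kind of review search by building a product search engine based on user reviews.
In this thesis we propose a search ranking mechanism for products based on Chinese-language user reviews, and we design and implement a feature-based ranking system called "t Search". By mining customer reviews, t Search recommends the most valuable products to the user. To improve search quality, the system takes into account the product features the user is interested in and bases its results on the opinions of previous consumers, rather than only on the basic product information supplied by the seller. The system also provides a visual opinion summary for each product, which helps customers form a general picture of overall opinion and identify the particular products that have attracted the most attention.
All data used in designing the system were crawled from taobao.com, a leading Chinese e-commerce website. The dataset covers five product categories: mobile phones, digital cameras, rice cookers, soymilk makers, and laptops. Each category contains products from the three most popular brands, and each product belongs to exactly one category. The crawl comprises 16,071 product pages with a total of 537,638 reviews. For each product page we collected the product title, the product ID, the basic product parameters, and all reviews of the product; a search index over product titles was then built so that results can be retrieved quickly.
The system accepts keywords in both Chinese and English. It supports comma-separated query strings, in which the first part is treated as the search keyword and the remaining parts as the product functions or features the user is interested in. Search results are ranked by a score computed for each product from its reviews.
For product feature extraction we use the Stanford typed dependencies representation. Its main characteristic is that it gives a simple description of the grammatical relations in a sentence, one that can be easily understood and used effectively by researchers without a background in natural language processing. Although the Chinese typed dependencies are designed along the same lines as the English ones, most of the grammatical structures exist only in Chinese. We consider five dependency types for extracting product features: nsubj, dobj, ccomp, nn, and attr. The extracted features fall into two categories: dependent features, which describe the product itself or its components, and independent features, which describe related services. Each category is assigned a different weight, on the assumption that customers care more about dependent features.
To make the system easier to use and manage, we also built a front-end and a back-end website. The front-end home page receives the search keywords entered by the user. When the user clicks the "Search" button, the keywords are submitted to the back end, which generates the list of products matching the query and ranks it by review score. The system supports search by keyword as well as search by a specific product function or feature.
The system was evaluated at two levels: the first measures the accuracy of product feature extraction and classification, and the second assesses the effectiveness of the search results and the usability of the visual summary. The experiments show a high level of precision in feature extraction and a high level of participant satisfaction with the ranking and the summary.
The main contributions of this thesis fall into four areas:
First: it proposes a ranking method for product search engines based on user-generated content, in which the extracted product features are divided into two categories, dependent features and independent features.
Second: from a mining perspective, the proposed system analyzes reviews at the aspect level, whereas the other methods known to us work at the document or sentence level.
Third: it provides a visual summary of product features, from which users can clearly see a product's strengths.
Fourth: the proposed system can process and classify product reviews in real time, capturing newly added product features and determining their polarity.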
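The comma-separated query format and the category-weighted, review-based ranking described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the t Search implementation: the names `parse_query`, `review_score`, `search`, and `Product`, the weight values 0.6 and 0.4, the per-feature polarity scores in [-1, 1], the weighted-mean formula, and the sample catalog are all hypothetical. The abstract states only that the two feature categories receive different weights and that dependent features are assumed to matter more.

```python
from dataclasses import dataclass, field

# Hypothetical weights: the abstract says only that dependent features
# (the product or its components) count for more than independent ones
# (related services); the actual values are not given.
CATEGORY_WEIGHTS = {"dependent": 0.6, "independent": 0.4}


def parse_query(raw: str) -> tuple[str, list[str]]:
    """Split a comma-separated query into (keyword, features of interest)."""
    # Accept both the ASCII comma and the full-width Chinese comma.
    parts = [p.strip() for p in raw.replace(",", ",").split(",")]
    parts = [p for p in parts if p]
    if not parts:
        raise ValueError("empty query")
    return parts[0], parts[1:]


@dataclass
class Product:
    title: str
    # feature name -> (category, mean review polarity in [-1, 1])
    features: dict[str, tuple[str, float]] = field(default_factory=dict)


def review_score(product: Product, wanted: list[str]) -> float:
    """Weighted mean polarity over the requested features (all if none given)."""
    names = wanted or list(product.features)
    total = weight_sum = 0.0
    for name in names:
        if name not in product.features:
            continue  # no review mentions this feature
        category, polarity = product.features[name]
        weight = CATEGORY_WEIGHTS[category]
        total += weight * polarity
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def search(raw_query: str, catalog: list[Product]) -> list[Product]:
    """Match the keyword against titles, then rank by review-based score."""
    keyword, wanted = parse_query(raw_query)
    hits = [p for p in catalog if keyword.lower() in p.title.lower()]
    return sorted(hits, key=lambda p: review_score(p, wanted), reverse=True)


# Made-up sample data, for illustration only.
catalog = [
    Product("Phone A", {"battery": ("dependent", 0.8),
                        "delivery": ("independent", -0.2)}),
    Product("Phone B", {"battery": ("dependent", 0.1),
                        "delivery": ("independent", 0.9)}),
    Product("Camera C", {"lens": ("dependent", 0.5)}),
]

for product in search("phone, battery", catalog):
    print(product.title)
```

With this made-up data, the query "phone, battery" ranks Phone A above Phone B, because only the battery polarity is considered. The query "phone" alone scores every extracted feature, and Phone B then edges ahead of Phone A on the strength of its delivery reviews.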
[Abstract]: With the spread of e-commerce platforms, the wealth of information online makes purchasing more convenient for consumers, but it also increases their cognitive load. Customers typically choose a product based solely on their own experience, the product pictures, and the basic product information. A growing number of people freely express their views online, and this large body of user-generated content can help users understand products more fully and make rational decisions, but reading every available review of every product in the search results is a time-consuming and arduous task. In general, customers read product reviews for one of two reasons: either to find a product whose features or related services are rated best, or to find reviews that comment on a specific product feature, such as the battery life of a mobile phone. The main purpose of this study is therefore to automate this kind of review search by building a product search engine based on user reviews. In this paper, we propose a Chinese-oriented mechanism for product search and ranking based on user reviews. We design a ranking system based on product features, called "t Search".
Article ID: 2387882
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2387882.html