Research on Methods for Extracting Implicit Evaluation Objects from Online Shopping User Reviews

Published: 2019-03-16 11:20
[Abstract]: With the rapid development of e-commerce in China, online shopping has become part of daily life. Because of information asymmetry, consumers find it difficult to learn the true condition of a product, so online user reviews serve as a reference for purchase decisions, and opinion mining of online reviews has attracted wide scholarly attention. Evaluation objects, one aspect of opinion mining, have been studied extensively, but existing work concentrates on explicit evaluation objects, and few scholars take implicit evaluation objects into account. For researchers, studying implicit evaluation objects can improve the accuracy of evaluation object research. For enterprises, fully mining implicit evaluation objects draws attention to the opinion targets hidden in consumer reviews and gives a more complete picture of how consumers experience each aspect of a product. For individual consumers, when an e-commerce platform extracts implicit evaluation objects, the reviews it displays or recommends become more truthful, and consumers obtain more precise opinions from other users on each aspect of a product.

On this basis, this thesis studies the mining of implicit evaluation objects in user reviews. The main work covers the following aspects:

(1) Data preprocessing. Real user reviews are crawled from the Taobao website with a data crawling tool, and the text is then processed by sentence splitting, word segmentation, feature selection, and vector representation. Because the initial feature-word space has high dimensionality, a particle swarm optimization (PSO) algorithm based on simulated annealing is applied to the feature set as a second round of feature extraction, reducing the dimensionality of the feature-word space. Experimental results show that the method reduces the feature-word space from 425 dimensions to 296 dimensions, indicating that it performs feature selection effectively.

(2) Cluster analysis of explicit evaluation sentences. Evaluation sentences are divided into explicit and implicit evaluation sentences, and text clustering is carried out on the explicit ones. After the evaluation sentences are represented by feature words, the resulting text vector space is still high-dimensional, so the thesis adopts the FCM clustering algorithm, which is suited to high-dimensional data sets. Because FCM tends to fall into local optima, the thesis proposes an improved FCM algorithm based on simulated annealing, which controls the FCM iteration process to avoid local optima. In the experiments, the explicit evaluation sentences are clustered into 9 categories, and each category is assigned a name. Experimental results show that the improved algorithm clusters the text reasonably.

(3) Extraction of evaluation objects from implicit evaluation sentences. After the explicit evaluation sentences are clustered, sentences of the same category are grouped into one document set. Because a mapping exists among the evaluation object, the evaluation word, and the category of an evaluation sentence, the thesis uses an association rule algorithm to mine the association rules of the different document sets and builds an association rule table linking categories, evaluation objects, and evaluation words. Implicit evaluation objects are then extracted on the basis of this table. Comparative experiments show that the proposed implicit evaluation object extraction method reaches an accuracy of 75.26% and can effectively improve the accuracy of text classification.
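As a rough illustration of step (3), the following sketch builds a small rule table from explicit evaluation sentences and uses it to guess the evaluation object of an implicit sentence. It is not the thesis's implementation: the function names, the support and confidence thresholds, the restriction to one-word antecedents, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_rule_table(explicit_sentences, min_support=2, min_confidence=0.5):
    """Mine simple rules of the form: evaluation word -> (category, object).

    explicit_sentences: iterable of (category, evaluation_object, evaluation_word)
    triples taken from already-clustered explicit evaluation sentences.
    The thresholds are hypothetical defaults, not values from the thesis.
    """
    pair_counts = Counter()
    word_counts = Counter()
    for category, obj, word in explicit_sentences:
        pair_counts[(word, category, obj)] += 1
        word_counts[word] += 1

    table = defaultdict(list)
    for (word, category, obj), support in pair_counts.items():
        # Confidence: share of this word's occurrences that go with this object.
        confidence = support / word_counts[word]
        if support >= min_support and confidence >= min_confidence:
            table[word].append((category, obj, support, confidence))
    for rules in table.values():
        rules.sort(key=lambda rule: rule[3], reverse=True)
    return dict(table)


def infer_implicit_object(evaluation_words, table):
    """Return the (category, object) of the highest-confidence matching rule,
    or None when no rule covers any of the given evaluation words."""
    best = None
    for word in evaluation_words:
        for category, obj, support, confidence in table.get(word, []):
            if best is None or confidence > best[3]:
                best = (category, obj, support, confidence)
    return None if best is None else (best[0], best[1])


# Toy data, invented for illustration only.
explicit = [
    ("logistics", "delivery", "fast"),
    ("logistics", "delivery", "fast"),
    ("logistics", "delivery", "fast"),
    ("service", "customer service", "fast"),
    ("price", "price", "expensive"),
    ("price", "price", "expensive"),
    ("appearance", "color", "pretty"),
]
table = build_rule_table(explicit)
print(infer_implicit_object(["fast"], table))  # ('logistics', 'delivery')
```

An implicit sentence such as "really fast" names no object, but the rule table maps "fast" to the delivery object with confidence 0.75 in this toy data. Words whose rules fall below the thresholds, such as "pretty" here, yield no prediction.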
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.1

[Citing literature]

Related journal articles (top 1)

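1 Han Zhongming; Li Mengqi; Liu Wen; Zhang Mengmei; Duan Dagao; Yu Chongchong; A survey of aspect-level opinion mining methods for online reviews [J]; Journal of Software; 2018, No. 02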



Article ID: 2441230

Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/2441230.html

