Sentiment Mining of Online Product Reviews Based on Multi-Feature Combinations with SVM and Probabilistic Neural Networks
Topic: SVM  Focus: probabilistic neural network  Source: Jiangsu University, 2017 master's thesis  Type: degree thesis
[Abstract]: With the spread of the Internet and the rapid development of e-commerce technology, people increasingly prefer to shop online. Compared with offline shopping, online shopping is convenient, saves time, and is less constrained by time and place. Before buying a product online, consumers usually browse the reviews posted below it, and after buying they post their own evaluation of the product or service. The emergence of online product reviews has also changed the point in time at which companies improve product quality. In traditional industrial engineering, that point was before the product left the production line. Now, companies can obtain user feedback after the product has been used, or learn users' real needs before the product is manufactured, which helps them understand consumers and improve product quality.

Whereas some scholars use machine learning methods to compute sentiment values for product features, this thesis focuses on the sentiment orientation of review text, that is, identifying whether a text expresses positive or negative sentiment. Reviews are processed at the clause level. SVM and a probabilistic neural network are both used to identify the sentiment orientation of clauses, and their results are compared. The probabilistic neural network is then used to predict clause sentiment, and the product attributes mentioned in the clauses are extracted and classified to obtain the distribution of consumer sentiment across product attribute categories.

First, taking the Huawei Honor Play 4X mobile phone on the Amazon website as an example, crawling rules for its online product reviews are defined, and the reviews are collected with the Bazhuayu (Octoparse) collector. The collected data are vectorized. Valid clauses in each review are identified and preprocessed with word segmentation and stop-word removal. Features such as sentiment words, negation words, degree adverbs, and special symbols are extracted from the clauses using the corresponding dictionaries. Then, text vectors are built from combinations of these features, and models are trained with both SVM and the probabilistic neural network. Their performance is evaluated to judge whether the probabilistic neural network can be used for text sentiment recognition. For each method, five groups of experiments are run with different feature combinations, and the results are used to analyze the contribution of each feature to sentiment recognition.

The experimental results show, first, that the number of sentiment words and the number of negation words in a clause contribute strongly to sentiment recognition, whereas degree adverbs and special symbols contribute only weakly. Second, judged by both accuracy and running time, the probabilistic neural network can be used for text sentiment recognition. Finally, the probabilistic neural network model is used to classify the experimental data, and the product attributes of the clauses are extracted and classified to obtain the distribution of consumer sentiment across attribute categories. The results indicate that the phone performs poorly on camera and screen, two aspects the company can improve in its next-generation product.
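The feature-combination and probabilistic neural network steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the tiny English lexicons, the split of sentiment words into positive and negative counts, the Gaussian kernel width `sigma`, and the function names `featurize` and `pnn_predict` are all assumptions made for illustration. The thesis itself works on segmented Chinese clauses with real dictionaries.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicons (assumptions); the thesis uses real dictionaries.
POSITIVE = {"good", "great", "clear"}
NEGATIVE = {"bad", "blurry", "slow"}
NEGATORS = {"not", "never"}
DEGREE = {"very", "really"}
SYMBOLS = {"!", "?"}


def featurize(tokens):
    """Count-based clause vector:
    [positive words, negative words, negation words, degree adverbs, symbols]."""
    return [
        sum(t in POSITIVE for t in tokens),
        sum(t in NEGATIVE for t in tokens),
        sum(t in NEGATORS for t in tokens),
        sum(t in DEGREE for t in tokens),
        sum(t in SYMBOLS for t in tokens),
    ]


def pnn_predict(x, train_X, train_y, sigma=0.5):
    """Probabilistic neural network as a Parzen-window classifier.

    Pattern layer: one Gaussian kernel per training sample.
    Summation layer: average kernel response per class.
    Decision layer: class with the largest average response.
    """
    responses = defaultdict(list)
    for xi, yi in zip(train_X, train_y):
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, xi))
        responses[yi].append(math.exp(-dist_sq / (2 * sigma ** 2)))
    density = {label: sum(v) / len(v) for label, v in responses.items()}
    return max(density, key=density.get)


# Made-up labeled clauses, already tokenized (illustrative data only).
TRAIN = [
    (["screen", "very", "clear"], "positive"),
    (["camera", "great", "!"], "positive"),
    (["camera", "blurry"], "negative"),
    (["battery", "not", "good"], "negative"),
]
TRAIN_X = [featurize(tokens) for tokens, _ in TRAIN]
TRAIN_Y = [label for _, label in TRAIN]

clause = ["screen", "not", "clear"]
print(pnn_predict(featurize(clause), TRAIN_X, TRAIN_Y))  # prints "negative"
```

Dropping or keeping individual entries of the vector returned by `featurize` is one simple way to mimic the idea of running separate experiments per feature combination, although the five combinations used in the thesis are not specified in the abstract.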
[Degree-granting institution]: Jiangsu University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP183; F713.36