
Research on Product Review Mining Techniques for B2C Websites

Published: 2018-05-30 15:03

Topic: product reviews + review mining. Source: Beijing Jiaotong University, master's thesis, 2014


[Abstract]: As the B2C market grows, the number of product reviews that consumers post online is increasing explosively. These reviews contain much information of value to both merchants and consumers, so identifying and exploiting that information accurately and efficiently promises considerable economic benefit and broad application prospects. This has made the mining and analysis of product reviews an active research topic in recent years. This thesis takes mobile phone reviews from JingDong Mall, a large B2C website, as its research object and studies two aspects of review text: sentiment classification and sentiment polarity analysis. The main work is as follows:

Sentiment classification of product review text is studied using the support vector machine (SVM) and naive Bayes (NB) methods. A training set is first obtained by manually selecting reviews collected online. The corpus is then preprocessed with the NLPIR word segmentation system, and the weights of feature words are computed with the TF-IDF method. Finally, the mutual information (MI), information gain (IG), and chi-square (CHI) feature selection methods are compared experimentally on the SVM and NB classifiers. The results show that with CHI feature selection, both SVM and NB reach a classification performance above 80%. In addition, with the same feature selection method, SVM outperforms NB, with an accuracy of up to 83%.

A "bidirectional iterative method" based on the proximity principle is used for fine-grained sentiment polarity analysis of product review text. A sentiment seed set is first built with the PMI-IR algorithm, and the bidirectional iterative method is then used to extract association pairs of feature words and sentiment words. On this basis, a method for constructing a sentiment lexicon is proposed, and a HowNet-based triple sentiment lexicon, Tri-HowNet, is built. Two polarity determination methods, one based on the HowNet polarity lexicon and one based on the Tri-HowNet sentiment lexicon, are compared experimentally. The results show that the latter performs better when determining the polarity of sentiment words with multiple senses.

A review mining system based on the SSH framework is designed and implemented. The system comprises five modules: lexicon maintenance, review collection, review classification, review sentiment analysis, and visualization. Reviews are first collected through the interface provided by the open-source Java library Crawler4j, using POST requests to simulate login. The reviews are then analyzed from the two directions of text sentiment classification and sentiment analysis. Finally, the results are stored in the product analysis database and can be displayed as 3D bar charts, which makes them convenient for users to query and use.
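The abstract names CHI (chi-square) feature selection but does not give its formula or implementation. The block below is a minimal sketch of the standard chi-square term score for a two-class sentiment corpus, not the thesis's own code. The function name `chi_square_scores`, the labels `"pos"` and `"neg"`, and the toy corpus are all assumptions made for illustration, and the documents are assumed to be already segmented into tokens.

```python
from collections import Counter


def chi_square_scores(docs, labels, target="pos"):
    """Score each term by the chi-square statistic against the `target` class.

    docs   : list of token lists (already word-segmented; hypothetical input)
    labels : list of class labels, one per document
    """
    n = len(docs)
    n_target = sum(1 for y in labels if y == target)

    in_target = Counter()  # A: target-class documents containing the term
    in_other = Counter()   # B: other-class documents containing the term
    for tokens, y in zip(docs, labels):
        for term in set(tokens):  # count document frequency, not term frequency
            if y == target:
                in_target[term] += 1
            else:
                in_other[term] += 1

    scores = {}
    for term in set(in_target) | set(in_other):
        a = in_target[term]
        b = in_other[term]
        c = n_target - a            # target-class documents without the term
        d = (n - n_target) - b      # other-class documents without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # chi2 = N * (AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D))
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


# Toy corpus, invented purely for illustration.
docs = [
    ["screen", "clear", "good"],
    ["battery", "good"],
    ["battery", "bad"],
    ["screen", "bad", "slow"],
]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels)
# Keep the two highest-scoring terms as features.
top = sorted(scores, key=scores.get, reverse=True)[:2]
```

On this toy corpus, words that appear in only one class ("good", "bad") get the highest score, while words spread evenly across both classes ("screen", "battery") score zero, which is the behavior a feature selector for sentiment classification needs. The retained terms would then be weighted (for example with TF-IDF) and passed to a classifier such as SVM or NB.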
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP391.1;TP393.092




Article ID: 1955723


Article link: http://sikaile.net/guanlilunwen/ydhl/1955723.html

