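A Genetic-Algorithm-Based Study of Fake-Review Analysis for Cross-Domain Product Reviews
Published: 2018-06-04 06:43
Topic of this thesis: fake reviews + cross-domain; Source: Yunnan University, master's thesis, 2016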
[Abstract]: As e-commerce has matured, online shopping has become the preferred way to shop for many consumers. Because product reviews influence purchase decisions, some sellers deliberately fabricate fake reviews to boost their own sales or to attack competitors. Fake-review analysis has therefore become an important topic in text sentiment analysis. Current approaches, however, are computationally complex and have low detection accuracy, and detection is especially difficult when labeled data is scarce or unavailable. This thesis therefore studies cross-domain fake-review analysis using transfer learning, a genetic algorithm, and graph-based techniques. First, for cross-domain fake product reviews, a genetic algorithm selects the optimal feature set from labeled fake reviews in a known source domain. The reviews are first converted into numerical form according to the characteristic features of fake reviews. The structured review data is then encoded as chromosome genes, and the genetic algorithm, with a fitness function built on logistic regression, selects the optimal feature set. Selecting an optimal feature set helps reduce the complexity of fake-review analysis. Experiments then examine how genuine and fake reviews differ on these features. Second, building on the optimal feature set, the thesis proposes a cross-domain fake-review detection method based on transfer learning. The method defines the association between the known and unknown domains from the document similarity between them, then uses graph-based techniques to train a sentiment classifier and identify fake reviews in the unknown domain. The experimental results indicate that the method is feasible and has certain advantages in identifying fake reviews.
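The feature-selection step described in the abstract (chromosome encoding plus a logistic-regression-based fitness function) can be illustrated with a minimal sketch. The abstract does not give the encoding, the genetic operators, or the exact fitness definition, so everything below is an assumption made for illustration: each chromosome is a bit mask over the review features, fitness is validation accuracy of a small gradient-descent logistic regression, and the operators are truncation selection, single-point crossover, and bit-flip mutation. The names `train_logreg`, `fitness`, and `select_features` are hypothetical and do not come from the thesis.

```python
import math
import random


def train_logreg(X, y, epochs=100, lr=0.5):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp so exp() stays in range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def fitness(mask, X_tr, y_tr, X_val, y_val):
    """Assumed fitness: validation accuracy using only features where mask == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    pick = lambda rows: [[row[j] for j in cols] for row in rows]
    w, b = train_logreg(pick(X_tr), y_tr)
    hits = 0
    for xi, yi in zip(pick(X_val), y_val):
        pred = 1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0
        hits += pred == yi
    return hits / len(y_val)


def select_features(X_tr, y_tr, X_val, y_val, pop_size=12, generations=10,
                    mutation_rate=0.1, seed=0):
    """Return (best_mask, best_fitness) found by a simple generational GA."""
    rng = random.Random(seed)
    n = len(X_tr[0])  # number of candidate features (needs n >= 2)
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # each distinct mask is trained only once
            cache[key] = fitness(mask, X_tr, y_tr, X_val, y_val)
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half become parents and survive.
        parents = sorted(population, key=score, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            children.append([bit ^ 1 if rng.random() < mutation_rate else bit
                             for bit in child])
        population = parents + children
    best = max(population, key=score)
    return best, score(best)
```

Calling `select_features(X_train, y_train, X_val, y_val)` on numeric review features (rows as lists of floats, labels as 0 for genuine and 1 for fake) returns a bit mask and its validation accuracy. Scoring on a held-out split rather than on the training data is a design choice of this sketch, intended to keep the search from rewarding feature sets that merely overfit.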
Third, based on the proposed method, a prototype system for fake-review detection is designed and implemented, providing a platform and a foundation for further research on fake-review identification methods.
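The cross-domain identification step in the abstract (linking the known and unknown domains by document similarity, then using a graph to label the unknown-domain reviews) can be sketched as follows. The abstract does not specify the similarity measure or the graph algorithm, so this example substitutes assumptions: cosine similarity over bag-of-words counts, and a plain iterative label propagation over the resulting similarity graph. It is not the thesis's classifier, and the names `cosine` and `propagate_labels` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def propagate_labels(source_docs, source_labels, target_docs, iterations=20):
    """Score each target review in [0, 1] (1 = fake) by spreading source labels.

    source_labels holds 0 (genuine) or 1 (fake) for the known domain.
    """
    src = [Counter(doc.lower().split()) for doc in source_docs]
    tgt = [Counter(doc.lower().split()) for doc in target_docs]
    # Graph edges: target-to-source and target-to-target similarities.
    w_src = [[cosine(t, s) for s in src] for t in tgt]
    w_tgt = [[cosine(t, u) if i != j else 0.0 for j, u in enumerate(tgt)]
             for i, t in enumerate(tgt)]
    scores = [0.5] * len(tgt)  # 0.5 means undecided
    for _ in range(iterations):
        updated = []
        for i in range(len(tgt)):
            # Weighted average of fixed source labels and current target scores.
            num = sum(w * lab for w, lab in zip(w_src[i], source_labels))
            num += sum(w * s for w, s in zip(w_tgt[i], scores))
            den = sum(w_src[i]) + sum(w_tgt[i])
            updated.append(num / den if den else 0.5)
        scores = updated
    return scores
```

A target review that resembles known fake reviews drifts toward 1, one that resembles genuine reviews drifts toward 0, and a review with no similar neighbors stays at 0.5. Whitespace tokenization is used only to keep the sketch self-contained; Chinese review text would need a proper word segmenter first.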
[Degree-granting institution]: Yunnan University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1; TP18
Article ID: 1976388
Article link: http://sikaile.net/jingjilunwen/dianzishangwulunwen/1976388.html