
Research on Key Technologies for Automatic Ontology Extraction for Weibo Sentiment Analysis

Published: 2018-05-10 13:29

  Topic: Weibo + sentiment words. Source: Capital Normal University, master's thesis, 2014


[Abstract]: With the rapid growth of new Internet applications, Weibo has risen quickly: it has 281 million users and a usage rate of 45.5%, and tens of millions of people use it every day to share their opinions and feelings on all kinds of topics. How to automatically detect the sentiment of Weibo users, and how to assess at the macro level the opinion tendency of the Weibo community toward a specific topic, has become a basic scientific problem that microblog computing and public opinion analysis urgently need to solve.

However, most earlier sentiment analysis works at the level of whole, conventional long texts. Weibo posts are short and irregular, and increasingly fragmented and subject-centered, so traditional sentiment analysis algorithms are fundamentally ill-suited to them: they are inefficient, and their results rarely meet practical needs. Using a sentiment lexicon to analyze the sentiment orientation of user-generated content is a simple and effective approach. But sentiment lexicons are limited in size, new Internet words appear constantly, language use is irregular, and manual curation is time-consuming, labor-intensive, and strongly domain-specific. To address these problems, this thesis proposes an algorithm that automatically mines latent sentiment words and computes their sentiment weights. The algorithm is independent of the application domain and extends well. Based on Bayesian principles and big-data mining, the method can discover previously unknown sentiment words and use the value of each word's sentiment weight to determine its polarity and the strength of its sentiment orientation. This effectively extends the sentiment lexicon and supports its finer-grained use, achieving automatic mining and acquisition of the sentiment lexicon. On this basis, the thesis also identifies the attributes of the sentiment subject, including opinion sentence identification, sentiment target extraction, and sentiment orientation classification, thereby completing automatic ontology extraction for sentiment analysis.

Building on the theoretical study, the thesis validates the algorithm in practice. To verify that the method works across domains, experiments were also run on three review corpora, from Jingdong Mall, Douban, and Dianping. Accuracy was generally above 90% in these experiments, which supports the effectiveness and practicality of the algorithm and provides a basis for sentiment analysis in Internet applications in general, not only Weibo.
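The abstract does not give the formula used for the sentiment weight, so the following is only a minimal sketch of one common Bayesian-style weighting, not the thesis's actual algorithm. It assumes a corpus of tokenized posts already labeled "pos" or "neg" (for example by a seed lexicon or by review ratings), and it scores each word by the smoothed log ratio of its class-conditional probabilities. The function name `sentiment_weights`, the `alpha` smoothing constant, and the toy corpus are all hypothetical. In this sketch the sign of the weight stands for polarity and its absolute value for strength.

```python
import math
from collections import Counter


def sentiment_weights(labeled_docs, alpha=1.0):
    """Return a log-odds sentiment weight for every word in the corpus.

    labeled_docs: iterable of (tokens, label) pairs, label is "pos" or "neg".
    alpha: Laplace smoothing constant (an assumption, not from the thesis).
    """
    counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in labeled_docs:
        counts[label].update(tokens)

    vocab = set(counts["pos"]) | set(counts["neg"])
    totals = {label: sum(c.values()) for label, c in counts.items()}

    weights = {}
    for word in vocab:
        # Smoothed estimates of P(word | pos) and P(word | neg).
        p_pos = (counts["pos"][word] + alpha) / (totals["pos"] + alpha * len(vocab))
        p_neg = (counts["neg"][word] + alpha) / (totals["neg"] + alpha * len(vocab))
        # Positive weight: word leans positive; negative weight: leans negative.
        weights[word] = math.log(p_pos / p_neg)
    return weights


# Toy corpus, invented purely for illustration.
toy_corpus = [
    (["good", "phone"], "pos"),
    (["bad", "phone"], "neg"),
]
toy_weights = sentiment_weights(toy_corpus)
```

On the toy corpus, "good" receives a positive weight, "bad" a negative one, and "phone", which appears equally often in both classes, a weight of zero. Words with large absolute weights that are missing from an existing lexicon would be the candidates for lexicon expansion under this scheme.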
[Degree-granting institution]: Capital Normal University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP393.092; TP391.1

