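Research on Key Technologies for Topic Detection, Tracking, and Analysis of Online Public Opinion
Keywords: online public opinion; topic detection; topic tracking; sentiment analysis. Source: Shandong University of Finance and Economics, 2013 master's thesis. Document type: degree thesis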
[Abstract]: As the Internet has become nearly ubiquitous, its influence on everyday life keeps growing. Faced with an ever-increasing volume of online information, users who want to follow and understand a particular topic in a timely way find that existing Internet search engines do not meet their needs. Public opinion monitoring and analysis systems help users detect, track, and analyze such topics.
This thesis studies topic detection and tracking techniques and sentiment analysis techniques for online public opinion. It first reviews the state of research on these techniques in China and abroad, then examines and compares several of the more important underlying techniques, including text representation models, Chinese-language analysis, feature weighting methods, and text classification. On that basis, it proposes the following contributions:
(1) A topic detection and topic tracking model based on event evolution. To address the problems caused by topic drift in current public opinion monitoring and analysis systems, the thesis uses the evolutionary relationship between seed events and novel events to propose an improved vector space model and text classification algorithm. Experimental analysis shows that the algorithm mitigates, to some extent, the loss of topic detection and tracking accuracy caused by topic drift.
(2) Sentiment orientation analysis based on the multifaceted nature of events. The sentiment analysis modules of most current public opinion monitoring and analysis systems aim only at the overall sentiment polarity of the opinions expressed about a topic or object, and ignore the fact that the object itself has multiple facets. By extracting sentiment analysis triples from sentiment-bearing sentences, the thesis obtains the local sentiment polarity of opinions about a topic or object, which makes the sentiment orientation analysis module more complete.
(3) Design of a public opinion monitoring and analysis system based on the Hadoop platform. Given the sheer volume of Internet information, storing and processing big data is one of the key steps in building such a system. Because Hadoop is well suited to big data storage and computation, the thesis uses it as the development platform for the system design.
Through its study of topic detection and tracking and of sentiment orientation analysis, the thesis develops a detailed understanding of both techniques and proposes improvements that target existing problems. It offers technical support for research on public opinion monitoring and analysis and is of theoretical significance.
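The abstract does not give the details of the improved vector space model, so the following is only a minimal sketch of the general idea behind tracking a topic that drifts from a seed event toward novel events. It assumes TF-IDF weighting, cosine similarity, and a topic centroid that is pulled toward each accepted story. The function names, the toy documents, the similarity threshold, and the update rate are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF so that no term gets a zero or negative weight.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def track_topic(seed, stream, threshold=0.1, rate=0.5):
    """Return indices of stream vectors judged on-topic.

    The centroid starts at the seed event and moves toward every accepted
    story, so later developments can still match. threshold and rate are
    illustrative values, not tuned ones. rate=0.0 keeps the centroid fixed.
    """
    centroid = dict(seed)
    on_topic = []
    for i, vec in enumerate(stream):
        if cosine(centroid, vec) >= threshold:
            on_topic.append(i)
            terms = set(centroid) | set(vec)
            centroid = {
                t: (1 - rate) * centroid.get(t, 0.0) + rate * vec.get(t, 0.0)
                for t in terms
            }
    return on_topic


# Toy, already-tokenized stories (hypothetical data).
docs = [
    ["earthquake", "city", "damage"],    # seed event
    ["earthquake", "rescue", "city"],    # follow-up report
    ["rescue", "donation", "relief"],    # later development, no seed terms
    ["football", "match", "score"],      # unrelated story
]
vectors = tf_idf_vectors(docs)
print(track_topic(vectors[0], vectors[1:]))            # [0, 1]
print(track_topic(vectors[0], vectors[1:], rate=0.0))  # [0]
```

With a fixed centroid the later development is missed because it shares no terms with the seed event. With the drifting centroid it is accepted through the vocabulary contributed by the follow-up report, and the unrelated story is rejected in both cases.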
[Degree-granting institution]: Shandong University of Finance and Economics
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP393.09; TP391.1
Article ID: 1537834
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1537834.html