Research on Numerical Sequence Prediction Algorithms Based on Sentiment Analysis of Internet Text

Published: 2018-08-22 18:23
[Abstract]: In the information age, the Internet has become people's primary means of communication, and "Internet Plus" in particular is continually changing how people live. As 4G networks mature and the number of mobile device users grows, more and more users exchange information through a variety of media channels, expressing their opinions and sentiment about goods, social events, and services. Because online information spreads widely and quickly among a large number of users, the volume of data grows explosively. Accumulated over time, this data comes to reflect the collective wisdom of society, so mining Internet big data to analyze users' emotional states and the sentiment orientation expressed on social media offers predictive power for many social activities. Research on prediction algorithms based on sentiment analysis still faces many open problems, such as collecting Internet information, analyzing text credibility, selecting variables for the prediction model, and prediction sensitivity. To address these problems, this thesis develops data collection and text classification interfaces and proposes univariate and multivariate prediction models based on the sentiment orientation of credible event information, which are used to predict commodity prices and real-estate stock prices. It provides a general-purpose interface for collecting news data from search engines, a price data collector built on the Scrapy framework, and a text credibility classification model. Together these address the generality of text collection across domains, the collection of information from dynamic web pages, and the credibility of text. Researchers only need to supply keywords and the XPath paths for price data, as required by the interface documentation, to collect text data and the associated price data for their own domain.
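The idea that a researcher supplies only a keyword and an XPath path can be illustrated with a minimal sketch. The abstract does not give the interface itself, so the names `CollectorConfig` and `extract_prices` and the sample page are hypothetical, and the standard library's `xml.etree.ElementTree` (which supports only a subset of XPath and requires well-formed markup) stands in for the Scrapy-based collector described in the thesis.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class CollectorConfig:
    """Hypothetical per-domain settings a researcher would supply."""
    keyword: str      # search keyword for news collection (unused in this offline sketch)
    price_xpath: str  # path that locates price nodes in a page


def extract_prices(page: str, config: CollectorConfig) -> List[float]:
    """Return every price found at config.price_xpath in a well-formed page."""
    root = ET.fromstring(page.strip())
    prices = []
    for node in root.findall(config.price_xpath):
        # Drop thousands separators before converting to a number.
        text = (node.text or "").strip().replace(",", "")
        if text:
            prices.append(float(text))
    return prices


# Assumed sample markup; a real page would be fetched by the crawler.
SAMPLE_PAGE = """
<html><body>
  <div class="item"><span class="price">1,299.00</span></div>
  <div class="item"><span class="price">1,315.50</span></div>
  <div class="item"><span class="name">no price here</span></div>
</body></html>
"""

config = CollectorConfig(keyword="mobile phone",
                         price_xpath=".//span[@class='price']")
prices = extract_prices(SAMPLE_PAGE, config)
print(prices)
```

Switching to another domain in this sketch means changing only the two fields of `CollectorConfig`, which is the kind of reuse the abstract attributes to its collection interface.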
After credibility classification, the collected web text is used to compute a sentiment orientation factor, which then feeds into the prediction algorithms. Building on this text analysis, the thesis proposes univariate and multivariate price prediction algorithms based on the sentiment orientation of credible events. Because time series algorithms require sample data that is stationary and free of trend, the data is first tested by fitting, and differencing is applied to remove trend and make the samples stationary. For the univariate case, the SSA-ARMA model is proposed, which incorporates the sentiment orientation factor of credible text; training yields the optimal orders of the autoregressive and moving-average terms. Across multiple groups of comparative experiments, the new model shows lower error and clearly improved prediction performance. To further explain how strongly text sentiment affects prices, the multivariate MSA-VAR model is proposed; analysis of the impulse responses and fluctuations of several variables in the real-estate stock market shows that the sentiment factor has a clear effect on the closing price, and experiments show that MSA-VAR achieves good prediction performance and robustness. Finally, the results of the algorithm research are applied to implement a price prediction application for mobile devices, which has high practical value.
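The differencing step and the use of a sentiment factor in a univariate model can be sketched as follows. This is not the SSA-ARMA model, whose details the abstract does not give. It is a simplified stand-in: an AR(1) model on first differences with sentiment as an extra regressor, no moving-average term, fitted by ordinary least squares. All names and the synthetic, noise-free data are assumptions made for illustration.

```python
from typing import List, Sequence


def difference(series: Sequence[float], order: int = 1) -> List[float]:
    """Apply first differencing `order` times to remove trend."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ar1_with_sentiment(y: Sequence[float], s: Sequence[float]) -> List[float]:
    """Least-squares fit of y[t] = c + phi * y[t-1] + beta * s[t]."""
    rows = [[1.0, y[t - 1], s[t]] for t in range(1, len(y))]
    targets = [y[t] for t in range(1, len(y))]
    k = 3
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic data (assumed): price changes follow the model exactly.
TRUE_C, TRUE_PHI, TRUE_BETA = 0.5, 0.6, 2.0
sentiment = [0.2, -0.1, 0.4, 0.0, -0.3, 0.5, 0.1, -0.2, 0.3, -0.4, 0.2, 0.0]
true_diffs = [1.0]
for t in range(1, len(sentiment)):
    true_diffs.append(TRUE_C + TRUE_PHI * true_diffs[-1] + TRUE_BETA * sentiment[t])
prices = [100.0]
for d in true_diffs:
    prices.append(prices[-1] + d)

diffs = difference(prices)  # trending prices -> differenced series
c, phi, beta = fit_ar1_with_sentiment(diffs, sentiment)

# One-step forecast: predict the next change, then undo the differencing.
next_sentiment = 0.1  # assumed sentiment score for the next period
next_price = prices[-1] + c + phi * diffs[-1] + beta * next_sentiment
print(c, phi, beta, next_price)
```

Because the synthetic data contains no noise, the fit recovers coefficients very close to the ones used to generate it; on real price data the estimates would carry error, and the model orders would have to be selected by training, as the abstract describes.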
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1

