Research on a Classification Method for Sudden Financial Texts Based on Expected Deviation

Published: 2018-07-21 10:21

[Abstract]: As China's economy develops, financial markets have become increasingly bound up with people's daily lives. Research shows that sudden financial information can quickly and strongly disturb financial markets, and that the rapid development of Internet technology and social networks greatly amplifies this effect. Normally, stock prices rise quickly on good news and tend to fall on bad news. Recently, however, securities prices have shown an overall decline even in the face of important good news, which poses a considerable challenge to traditional methods based on financial information mining. In such cases, traditional text classification methods cannot classify financial news accurately. The reason is that they usually concentrate on the classification model itself and simply feed text features into the model to predict a class label. To address this problem, this thesis proposes a financial text classification method based on expected deviation. After introducing the concept of expected deviation, it matches texts by topic using a topic model, computes the expected deviation of news using a descriptive dictionary, and finally obtains an expected-deviation-based classification model for classifying texts.

The main work and results of this thesis are as follows. First, it uses strong-disturbance resonance filtering and K-means text clustering filtering to extract useful sudden news from a large volume of news text, which implements the initial screening of news text. Second, to address the poor performance of common text classification methods, it proposes a classification method based on expected deviation. Through an analysis of the LDA topic model, it introduces the concept of topic matching between news texts. The topic clustering results of news texts are used as a prior distribution to predict the topic of a news text and to compute the similarity between the topics of news texts. Building on topic similarity, it then proposes a dictionary-based method for measuring the degree of deviation between news texts. Finally, it combines LDA news topic matching with the measured degree of deviation between news items to construct a classification model for news text. Experimental results show that, under abnormal financial market conditions, the proposed method classifies news more accurately.
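The abstract does not give the formulas behind topic matching or the deviation measure, so the following Python sketch is only one plausible reading of the pipeline, not the thesis's implementation. It assumes that topic distributions have already been inferred by an LDA model, that topic similarity is cosine similarity, and that the descriptive dictionary maps words to signed intensity scores. The names `LEXICON`, `expected_deviation`, `classify`, `min_similarity` and `margin`, and all numeric values, are hypothetical.

```python
import math

# Hypothetical descriptive dictionary: word -> signed intensity.
# The values are illustrative only and do not come from the thesis.
LEXICON = {"surge": 2.0, "rise": 1.0, "stable": 0.0, "fall": -1.0, "plunge": -2.0}


def cosine(p, q):
    """Cosine similarity between two topic distributions of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def lexicon_score(tokens):
    """Mean dictionary intensity of the tokens found in LEXICON (0.0 if none)."""
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def expected_deviation(news, history, min_similarity=0.8):
    """Score of `news` minus the score of its most topic-similar earlier item.

    Each item is a (topic_distribution, tokens) pair. Returns None when no
    earlier item is similar enough to count as a topic match (assumed rule).
    """
    topic, tokens = news
    best = max(history, key=lambda h: cosine(topic, h[0]), default=None)
    if best is None or cosine(topic, best[0]) < min_similarity:
        return None
    return lexicon_score(tokens) - lexicon_score(best[1])


def classify(deviation, margin=0.5):
    """Map a deviation to a coarse label; the margin is an assumed threshold."""
    if deviation is None:
        return "unmatched"
    if deviation > margin:
        return "above expectation"
    if deviation < -margin:
        return "below expectation"
    return "in line with expectation"


# Toy data: earlier news on the same topic said "surge"; the new item says "rise".
history = [
    ([0.9, 0.1], ["profit", "surge"]),
    ([0.1, 0.9], ["rates", "stable"]),
]
news = ([0.85, 0.15], ["profit", "rise"])
label = classify(expected_deviation(news, history))
```

In this toy case the new item is positive in isolation, yet it is weaker than the matched earlier report, so `label` is "below expectation". That mirrors the situation described in the abstract, where prices fall on nominally good news.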
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1



Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2135209.html
