Research on a Classification Method for Sudden Financial Texts Based on Expected Deviation
Published: 2018-07-21 10:21
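[Abstract]: As China's economy develops, the financial market is increasingly intertwined with people's daily lives. Research shows that sudden financial information can quickly and strongly disturb the financial market, and that the rapid growth of Internet technology and social networks greatly amplifies this effect. Normally, stock prices rise quickly on positive news and tend to fall on negative news. Recently, however, the securities market has shown an overall downward trend even when important positive news arrives, which poses a serious challenge to traditional methods based on financial information mining. In this situation, traditional text classification methods cannot classify financial news accurately. The reason is that they usually concentrate on the classification model itself and simply feed text features into the model to predict the class label of a text. To address this problem, this thesis proposes a financial text classification method based on expected deviation. After introducing the concept of expected deviation, the method matches texts by topic using a topic model, computes the expected deviation of a news item using a descriptive dictionary, and finally builds a classification model based on expected deviation to classify the text. The main work and results are as follows. First, strong-disturbance resonance filtering and K-means text clustering filtering are used to extract useful sudden news from a large volume of news texts, which implements the initial screening of news texts. Second, because commonly used text classification methods perform poorly here, a classification method based on expected deviation is proposed. Building on an analysis of the LDA topic model, the thesis introduces the concept of topic matching between news texts: the topic clustering results of news texts are used as a prior distribution to predict the topic of a news text and to compute the similarity between the topics of news texts. For texts with similar topics, a dictionary-based measure of the degree of deviation between news texts is then proposed. Finally, LDA news topic matching and the deviation measure are combined into a classification model for news texts. Experimental results show that, under abnormal financial market conditions, the proposed method classifies news more accurately.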
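The clustering-based screening step mentioned in the abstract could be sketched roughly as follows. The abstract does not say how clusters are turned into a filter, so this sketch assumes (purely for illustration) that an article is kept when its K-means cluster contains at least `min_cluster_size` articles, on the idea that a sudden event is covered by several similar reports. The function names, the whitespace tokenisation, and the deterministic centroid seeding are all hypothetical choices, not the thesis's implementation.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """L2-normalised term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, max_iter=20):
    """Plain K-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def filter_burst_news(docs, k=2, min_cluster_size=2):
    """Keep documents whose cluster is large enough (assumed filter rule)."""
    k = min(k, len(docs))
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = [tf_vector(d, vocab) for d in docs]
    labels = kmeans(vectors, k)
    sizes = Counter(labels)
    return [d for d, lab in zip(docs, labels) if sizes[lab] >= min_cluster_size]
```

In practice a library implementation with better initialisation and a proper Chinese tokeniser would replace the hand-written pieces; the point here is only the shape of the screening step.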
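To make the two-stage idea in the abstract concrete (match news by topic first, then measure a dictionary-based deviation), here is a minimal sketch. It assumes the topic distributions have already been produced by an LDA model elsewhere, uses cosine similarity as the topic-matching measure, and uses a tiny made-up `DEGREE` dictionary of intensity words. The similarity measure, the threshold, the dictionary contents, and the sign-based decision rule are all illustrative assumptions; the abstract does not specify them.

```python
import math

# Hypothetical descriptive dictionary: degree word -> intensity score.
DEGREE = {"slight": 1.0, "moderate": 2.0, "sharp": 3.0, "massive": 4.0}


def cosine(p, q):
    """Cosine similarity between two topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def degree_score(text):
    """Mean intensity of the dictionary words found in the text (0 if none)."""
    hits = [DEGREE[w] for w in text.lower().split() if w in DEGREE]
    return sum(hits) / len(hits) if hits else 0.0


def classify(new_text, new_topics, ref_text, ref_topics, sim_threshold=0.8):
    """Label a news item by how it deviates from a topic-matched reference.

    ref_text stands for earlier news that expresses the market's expectation.
    """
    if cosine(new_topics, ref_topics) < sim_threshold:
        return "unmatched"  # different topics: deviation is not comparable
    deviation = degree_score(new_text) - degree_score(ref_text)
    if deviation > 0:
        return "positive"
    if deviation < 0:
        return "negative"
    return "neutral"
```

For example, if earlier reports anticipated a "sharp" rate cut and the actual announcement is only a "slight" cut, the deviation is negative even though a rate cut is nominally good news, which is the kind of case the abstract says conventional feature-based classifiers mislabel.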
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.1
Article ID: 2135209
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2135209.html