Research on the Design of a Shanghai Stock Exchange Index Prediction Scheme Based on the XGBoost Algorithm
Topic: data mining; Focus: Shanghai Stock Exchange index; Source: Shanghai Normal University, master's thesis, 2017
[Abstract]: Data mining technology emerged in the late 1980s and developed rapidly in the 1990s. As the technology matured, more and more scholars applied it in different fields, and its combination with finance can bring investors additional returns. The stock market is a complex system affected by many sources of information, and its rises and falls are highly unstable and therefore hard to predict. Faced with a large volume of market information, investors usually hope to use known historical information to predict future market movements and apply those predictions to investment in order to obtain excess returns. Processing this volume of information by hand is unrealistic and too costly, so many scholars have used machine learning methods such as support vector machines (SVM) and BP neural networks to predict stock market movements, and the problem has become a hot topic over the last two years. Support vector machines and similar methods have limitations, however: to reach the best classification result they must separate the classes with a hyperplane in a high-dimensional space, which increases model complexity. The XGBoost algorithm, which the author describes as newly proposed in 2015, offers high computational efficiency and high accuracy, so the author applies it to predicting stock market rises and falls, giving investors a new scheme for making effective investment decisions.
Covering both the domestic stock market and major international stock indices, the thesis uses support vector machines, a decision tree model, and the XGBoost algorithm to predict the rise and fall of the Shanghai Composite Index, the SSE 50 Index, and the Standard & Poor's index. To improve the predictive performance of all three methods, the author processed the trading-volume data so that its values do not differ too much in magnitude from the other indicators, and also tuned the parameters of the XGBoost algorithm. Twenty-eight technical indicators were selected as input variables, and the next day's rise or fall was used as the classification output variable. The support vector machine, decision tree, and XGBoost models were built in RStudio, and relatively reasonable empirical results were obtained.
The results show that the XGBoost model predicts the Shanghai Composite Index very well, with an accuracy above 70%. The author attributes this to the principle of the algorithm: each iteration fits the error left by the previous one so as to minimize the squared loss function, which yields higher accuracy than ordinary algorithms. Prediction accuracy for the SSE 50 and Standard & Poor's indices was 60% to 65%, possibly because these two indices include only a selected subset of stocks. Dividing the data by trend also gave higher prediction accuracy, and investing on the basis of the XGBoost predictions produced satisfactory excess returns. The support vector machine and decision tree models were slightly less accurate but still exceeded 60%. Machine learning methods therefore offer some guidance for stock market prediction and investment, and provide a convenient and feasible scheme for investor decision making and government regulation.
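The data preparation and evaluation steps described in the abstract can be sketched roughly as follows. This is only an illustration in Python (the thesis itself worked in RStudio), and several details are assumptions: the abstract does not say how the volume data was processed, so min-max scaling is used here as one plausible choice; a "rise" is taken to mean a strictly higher close on the following day; the function names and sample numbers are hypothetical and are not taken from the thesis.

```python
def next_day_labels(closes):
    """Label day t as 1 if the close on day t+1 is strictly higher, else 0.

    Assumption: the thesis does not define "rise" precisely; a flat day
    is treated as "not a rise" here.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def min_max_scale(values):
    """Rescale values to the range [0, 1].

    Assumption: stands in for the unspecified volume processing that
    brings volume to a magnitude comparable with other indicators.
    A constant series maps to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Made-up sample data, for illustration only.
closes = [3100.0, 3120.5, 3110.2, 3110.2, 3150.0]
volumes = [1.8e10, 2.4e10, 2.1e10, 1.5e10, 3.0e10]

labels = next_day_labels(closes)         # one label per day except the last
scaled_volumes = min_max_scale(volumes)  # now comparable to bounded indicators

# Hypothetical classifier output compared against the true labels.
example_predictions = [1, 0, 1, 1]
example_accuracy = accuracy(example_predictions, labels)
```

In a full pipeline, the scaled volume would be one of the 28 input indicators, and the labels would be the classification target passed to each model; the accuracy function corresponds to the percentage figures reported in the abstract.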
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F831.51
Article ID: 1717544
Article link: http://sikaile.net/jingjilunwen/jinrongzhengquanlunwen/1717544.html