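Design of a Multi-Factor Quantitative Stock Selection Scheme Based on the XGBoost Algorithm
Source: Shanghai Normal University, master's thesis, 2017. Document type: degree thesis
Keywords: quantitative stock selection; XGBoost; multi-factor; scheme design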
[Abstract]: In recent years, quantitative investment has drawn growing attention from institutional investors and hedge funds thanks to its discipline, systematic approach, timeliness, and diversification. At the same time, both the size of China's securities market and the number of securities accounts are growing rapidly. Judging from the efficiency of China's securities market and the development experience of foreign markets, the prospects for quantitative investment are beyond doubt. Even so, domestic quantitative investment products still suffer from small overall scale, a narrow range of strategies, and divergent strategy performance. Studying new quantitative investment approaches and new modeling ideas is therefore of great significance for enriching quantitative investment products, enlarging the market, and advancing quantitative investment.
Among the many quantitative strategies, multi-factor stock selection has attracted many researchers because of its stability and broad coverage. A multi-factor stock selection scheme has two main concerns: first, that the factors selected are sufficiently comprehensive, and second, that the classification model generalizes well. This thesis makes improvements in both directions. First, it is the first to collect factor data relatively comprehensively: 307 factors are used in total, and in addition to the financial, dividend, and momentum factors used by most researchers, it adds factors related to size, valuation, the macroeconomy, bonds, and the housing market. Second, it is the first to apply the relatively new XGBoost boosting algorithm to this problem. The main advantages of this algorithm are as follows: XGBoost supports linear classifiers, in which case it amounts to logistic regression or linear regression with L1 and L2 regularization terms; XGBoost adds a regularization term to the cost function, which makes the learned model simpler and guards against overfitting; and XGBoost borrows column subsampling from random forests, which both reduces overfitting and cuts computation. The XGBoost tool also supports parallel computation and is fast. The thesis compares the strengths, weaknesses, and modeling results of SVM, random forest, and XGBoost, and finds that XGBoost performs best in both effectiveness and stability. Third, it changes the conventional factor-screening method and modeling workflow by screening factors during training, which makes the screening more principled.
Based on these ideas, the thesis designs a multi-factor quantitative stock selection scheme that uses machine learning to select stocks and earns excess returns over the CSI 300 index. Over 23 holding periods, the selected stock portfolio achieves a total return of 287% and an annualized compound return of 127%, with a Sharpe ratio of 0.91 and an information ratio of 2.41. It outperforms the CSI 300 index in 82% of quarters and earns a positive return in 59% of quarters, and its final net value reaches 3.87, far exceeding the return of the benchmark CSI 300 index.
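The abstract reports a final net value, a Sharpe ratio, an information ratio, and the share of quarters that beat the benchmark, but it does not give the formulas behind them. The following is a minimal Python sketch of how such figures are commonly computed, not the thesis's own procedure. It assumes quarterly holding periods, a zero risk-free rate, and square-root-of-time annualization; the function names and the return series are hypothetical and invented purely for illustration.

```python
import math
import statistics

PERIODS_PER_YEAR = 4  # assumption: one holding period per quarter


def net_value(returns):
    """Compound per-period returns into a final net value, starting from 1.0."""
    value = 1.0
    for r in returns:
        value *= 1.0 + r
    return value


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=PERIODS_PER_YEAR):
    """Annualized Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def information_ratio(returns, benchmark, periods_per_year=PERIODS_PER_YEAR):
    """Annualized information ratio: mean active return over tracking error."""
    active = [r - b for r, b in zip(returns, benchmark)]
    return statistics.mean(active) / statistics.stdev(active) * math.sqrt(periods_per_year)


def win_rate(returns, benchmark):
    """Fraction of periods in which the portfolio return exceeds the benchmark return."""
    wins = sum(1 for r, b in zip(returns, benchmark) if r > b)
    return wins / len(returns)


# Hypothetical quarterly returns, invented for illustration only (not the thesis data).
portfolio = [0.10, -0.05, 0.20, 0.00]
benchmark = [0.05, -0.10, 0.10, 0.05]

final_value = net_value(portfolio)
total_return = final_value - 1.0  # total return is the final net value minus the starting value of 1.0

print(f"net value: {final_value:.3f}, total return: {total_return:.1%}")
print(f"Sharpe ratio: {sharpe_ratio(portfolio):.2f}")
print(f"information ratio: {information_ratio(portfolio, benchmark):.2f}")
print(f"share of periods beating the benchmark: {win_rate(portfolio, benchmark):.0%}")
```

Under this convention a final net value of 3.87 corresponds to a total return of 287%, which matches the two figures given in the abstract. The other reported ratios depend on the risk-free rate and annualization convention actually used in the thesis, which the abstract does not state.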
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: F832.51; F224