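Variable Selection Methods in the Cox Model and an Empirical Study of the Stock Market
Published: 2018-04-13 13:25
Topic of this thesis: Cox proportional hazards regression model + variable selection; Source: Zhongnan University of Economics and Law, master's thesis, 2017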
[Abstract]: In recent years, survival analysis methods and techniques have been widely applied in epidemiology and clinical medicine, and researchers have gradually introduced them into demography, actuarial science, economics, and other fields. Their application in finance, however, remains limited. This thesis applies the Cox proportional hazards regression model to stock trading data, using the constituent stocks of the CSI 300 Index as the sample. The aim is to identify the important factors that affect stock survival time and to compare the strengths and weaknesses of variable selection methods for the Cox model, in the hope of finding a more suitable method for studying the stock market.

First, numerical simulation experiments are carried out under two scenarios, mutually independent covariates and correlated covariates, to examine the variable selection performance of the Lasso and Elastic Net methods under the Cox proportional hazards regression model and to verify the grouping effect of the Elastic Net method. These experiments prepare for the empirical analysis of the CSI 300 constituent stock data.

Next, 30 financial indicators are collected for each stock from the Guotai Junan database, with the first quarter of 2016 as the observation period. The survival time of a CSI 300 stock is defined, the survival time and survival status of each stock in that quarter are obtained, and the required base data set is assembled. Analysis of the first-quarter 2016 data yields the correlation coefficients among the 30 financial indicators, and descriptive statistics are computed to understand the basic characteristics of the covariates. Three methods, Cox stepwise regression, Lasso, and Elastic Net, are then applied in the empirical analysis. The models are fitted with the coordinate descent algorithm, and 10-fold cross-validation is used to choose suitable parameter values. This yields the important covariates that affect stock survival time, and the magnitude and direction of their effects are analysed.

Finally, the three methods are compared and the important covariates selected by all three are summarised. The Lasso and Elastic Net methods select variables better than Cox stepwise regression: the covariate sets they select are more parsimonious and contain no redundant variables. The variables selected by Cox stepwise regression exhibit multicollinearity, which indicates that this method is poorly suited to correlated predictors. The variables selected by the Lasso method are not correlated with one another, which indicates that Lasso handles collinearity among predictors reasonably well. A distinctive feature of the Elastic Net method is its grouping effect: it can select correlated, even strongly correlated, covariates into the model together. The Lasso method lacks this property; among a set of correlated variables it selects only one and cannot bring them into the model at the same time. The Elastic Net method is particularly preferable to Lasso when the data are high-dimensional, the sample is small, and the covariates are strongly correlated. In terms of goodness of fit, the Lasso and Elastic Net methods outperform Cox stepwise regression, and the Lasso method gives the best fit.
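The penalised fitting step and the grouping effect described in the abstract can be illustrated with a small pure-Python sketch. It is not the thesis's code: the abstract only states that coordinate descent was used, so the sketch assumes the common approach of re-linearising the Cox log partial likelihood around the current coefficients and then applying soft-thresholding updates one coordinate at a time. The function names, the simulated data, the penalty parameterisation (`lam`, `alpha`), and the values `lam=0.1` and `alpha=0.5` are all hypothetical choices made for illustration, and tied event times are assumed not to occur.

```python
import math
import random


def soft_threshold(z, g):
    """S(z, g) = sign(z) * max(|z| - g, 0), the Lasso shrinkage operator."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def cox_gradient_and_weights(X, event, beta):
    """Gradient and diagonal curvature of the log partial likelihood with
    respect to the linear predictor. Rows must be sorted by increasing
    time; tied event times are assumed not to occur."""
    n = len(X)
    exp_eta = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
    # risk_sum[i] = sum of exp(eta_j) over subjects still at risk at time i.
    risk_sum = [0.0] * n
    running = 0.0
    for i in range(n - 1, -1, -1):
        running += exp_eta[i]
        risk_sum[i] = running
    grad, weight = [], []
    a1 = a2 = 0.0  # running sums of 1/risk_sum and 1/risk_sum^2 over events
    for k in range(n):
        if event[k]:
            a1 += 1.0 / risk_sum[k]
            a2 += 1.0 / risk_sum[k] ** 2
        grad.append(event[k] - exp_eta[k] * a1)
        weight.append(exp_eta[k] * a1 - exp_eta[k] ** 2 * a2)
    return grad, weight


def fit_cox_enet(X, event, lam, alpha, sweeps=300):
    """Cyclic coordinate descent for an elastic-net-penalised Cox model.
    alpha=1 gives the Lasso penalty; 0 < alpha < 1 mixes in a ridge term."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        # Re-linearise around the current beta, then do one coordinate sweep.
        resid, weight = cox_gradient_and_weights(X, event, beta)
        for j in range(p):
            v = sum(w * row[j] ** 2 for w, row in zip(weight, X)) / n
            c = sum(r * row[j] for r, row in zip(resid, X)) / n + v * beta[j]
            new = soft_threshold(c, lam * alpha) / (v + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                # Keep the working residual consistent with the new beta[j].
                resid = [r - w * row[j] * delta
                         for r, w, row in zip(resid, weight, X)]
                beta[j] = new
    return beta


def simulate(n=80, seed=0):
    """Hypothetical data: columns 0 and 1 are identical (perfectly
    correlated) and carry the signal; column 2 is pure noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        t = rng.expovariate(math.exp(1.2 * x))  # hazard depends on x only
        c = rng.expovariate(0.3)                # independent censoring time
        rows.append((min(t, c), 1 if t <= c else 0, [x, x, noise]))
    rows.sort(key=lambda r: r[0])
    return [r[2] for r in rows], [r[1] for r in rows]


X, event = simulate()
lasso = fit_cox_enet(X, event, lam=0.1, alpha=1.0)
enet = fit_cox_enet(X, event, lam=0.1, alpha=0.5)
print("Lasso      :", [round(b, 3) for b in lasso])
print("Elastic Net:", [round(b, 3) for b in enet])
```

With two identical covariates, the Lasso fit in this sketch keeps only the first of the pair, while the Elastic Net fit assigns both nearly the same coefficient, which is the grouping effect the abstract refers to. The penalty strength is fixed here for brevity; the thesis selects it by 10-fold cross-validation.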
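[Degree-granting institution]: Zhongnan University of Economics and Law
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F832.51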
Article ID: 1744706
Article link: http://sikaile.net/jingjilunwen/huobiyinxinglunwen/1744706.html