
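Research on Several Problems of Variable Selection in Regression Models

Published: 2018-05-19 16:52

  Topic of this thesis: variable selection + Gamma distribution. Source: Lanzhou Jiaotong University, master's thesis, 2017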


【Abstract】: In multiple linear regression modeling, the choice of independent variables is critical. The number of independent variables is generally constrained by two considerations: prediction accuracy and model interpretability. A large number of independent variables can capture more information about the response variable and so achieve higher prediction accuracy, but too many of them weaken the interpretability of the model and greatly reduce its practical value; too few fail to capture enough information about the response variable, so prediction accuracy drops significantly. Most research on variable selection starts from ordinary least squares and adds constraints on the parameters to be estimated, that is, adds a penalty function, which turns the problem into penalized least squares. The shrinkage effect of the constraints drives some of the estimated parameters to 0, which achieves variable selection. Commonly used classical algorithms of this kind include LASSO, adaptive LASSO, SCAD, and the elastic net. This thesis assumes that the parameters to be estimated are affected by random factors, establishes new penalty functions and a corresponding penalized least squares estimation method on that premise, and evaluates the method. The specific contents are as follows. First, the thesis systematically introduces the development of variable selection methods and the basic idea of achieving variable selection by adding a penalty function, and analyzes in detail the construction of LASSO, adaptive LASSO, SCAD, and the elastic net, along with their respective strengths and weaknesses. Because of the nature of its penalty function, LASSO tends to select too many independent variables, and it performs poorly under multicollinearity. Adaptive LASSO improves on LASSO so that the estimated coefficients are sparser and fewer independent variables are selected. SCAD has a more pronounced effect: it selects fewer independent variables, and its estimator satisfies a series of desirable properties including sparsity, unbiasedness, continuity, and the Oracle property. The elastic net is a variable selection method built by combining LASSO with classical ridge regression; its main advantage lies in handling grouping effects among the independent variables. Second, because the Gamma and Weibull distributions are two important families of lifetime distributions with wide application, the random factors affecting the parameters are assumed to follow a Gamma distribution and a Weibull distribution respectively, and new penalty functions and penalized least squares estimation methods are established. The thesis constructs the new penalty functions by hierarchical maximum likelihood estimation, discusses their properties, gives a parameter estimation method, and proves that the newly established penalized least squares estimator satisfies the Oracle property. Finally, the new variable selection methods are evaluated through case studies. Using mean squared error and mean absolute error as evaluation metrics, the thesis analyzes classic cases used in earlier literature, computes each metric, and compares the results with those of LASSO, adaptive LASSO, SCAD, and the elastic net. The new algorithm shows a clear advantage in the sparse case, outperforming all the other algorithms, while in the non-sparse case its performance differs little from that of adaptive LASSO.
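The shrinkage mechanism the abstract describes, where a penalty added to least squares sets some coefficients exactly to 0, can be illustrated with a minimal sketch. It uses the plain LASSO penalty with cyclic coordinate descent, not the Gamma- or Weibull-based penalty functions proposed in the thesis, whose exact form the abstract does not give. The synthetic data, the coefficient vector `true_beta`, the penalty level `lam`, and all function names are assumptions made for illustration; the `mse` and `mae` helpers correspond to the two evaluation metrics named in the abstract.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward 0 by t; return exactly 0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept).

    Assumes every column of X has at least one nonzero entry.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes
            # column j's own current contribution.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new_bj
    return beta


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Synthetic sparse example (assumed for illustration): 3 of 8 coefficients
# are nonzero, predictors are independent standard normal draws.
random.seed(0)
true_beta = [3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
n = 200
X = [[random.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0.0, 0.5)
     for row in X]

beta_hat = lasso_coordinate_descent(X, y, lam=0.5)
y_fit = [sum(x * b for x, b in zip(row, beta_hat)) for row in X]

print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("in-sample MSE:", round(mse(y, y_fit), 3))
print("in-sample MAE:", round(mae(y, y_fit), 3))
```

In this sketch the soft-thresholding step is what produces exact zeros, so the set of nonzero entries in `beta_hat` is the selected variable set. The same step also shrinks the retained coefficients toward 0, which is the kind of estimation bias that alternatives such as adaptive LASSO and SCAD are designed to reduce.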
【Degree-granting institution】: Lanzhou Jiaotong University
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: F224


Article ID: 1910897


Article link: http://sikaile.net/jingjifazhanlunwen/1910897.html

