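Several Studies on Linear Models and Single-Index Models
Published: 2018-04-25 17:31
Topic of this thesis: linear models + single-index models. Source: Chongqing University, doctoral dissertation, 2016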
[Abstract]: Robust estimation and variable selection are two very important aspects of statistical modeling. Variable selection means identifying the covariates that truly affect the response variable, thereby reducing model complexity and improving prediction accuracy. At the same time, we want the proposed estimation methods to be robust, especially when the data contain many outliers, so that the variable selection results are not strongly affected. On the other hand, longitudinal data are widely used in biomedicine, economics, sociology, and other fields, and have become one of the active topics of statistical research. Based on linear models, generalized linear models, single-index models, and single-index coefficient models, this thesis studies robust estimation, variable selection, and longitudinal data analysis.
Chapter 2 considers linear models in which the number of parameters diverges with the sample size. Based on the SCAD penalty function and rank regression, it proposes a robust variable selection method that effectively overcomes the influence of outliers in the response variable or heavy-tailed error distributions. Under some regularity conditions, the proposed estimator is proved to be consistent and to have the oracle property. Further, to overcome the computational difficulties of existing methods, the chapter proposes a greedy coordinate descent algorithm that quickly computes the penalized rank regression estimator. To handle the case where the number of parameters p exceeds the sample size n, the chapter proposes a two-step estimator based on independence screening with distance correlation and proves that the two-step estimator has the oracle property. Finally, numerical simulations verify the robustness and effectiveness of the proposed method.
Chapter 3 addresses the fact that the linear model of the previous chapter cannot handle discrete response variables, and studies robust estimation and variable selection for longitudinal generalized linear models. Specifically, an exponential score function is combined with a weight function to construct robust and efficient estimating equations that overcome the influence of outliers in both the response variable and the covariates. To avoid solving a convex optimization problem, the chapter constructs robust and efficient smooth-threshold generalized estimating equations that perform parameter estimation and variable selection simultaneously. Under some regularity conditions, the proposed estimator is proved to be consistent and to have the oracle property. Further, its robustness is proved by means of the influence function. Finally, numerical simulations and a real data analysis verify the finite-sample properties of the proposed estimator.
Chapter 4 studies estimation for longitudinal single-index models. First, initial estimates of the index coefficient vector and the nonparametric link function are obtained by ignoring the within-subject correlation of the repeated measurements. Second, to avoid estimating the working correlation matrix in the generalized estimating equations, the covariance matrix is decomposed into autoregressive coefficients and innovation variances by the modified Cholesky decomposition, and these are then estimated through regression modeling. Third, more efficient two-step estimators of the index coefficient vector and the nonparametric link function are constructed by profile weighted least squares. Under some regularity conditions, the consistency and asymptotic normality of the proposed estimators are proved. Finally, numerical simulations and a real data analysis verify the superiority of the proposed method.
Chapter 5 proposes a robust and efficient estimation method for single-index coefficient models by combining local linear approximation with modal regression. Under some regularity conditions, the consistency and asymptotic normality of the proposed estimators are established. Further, the theoretically optimal bandwidth is discussed, a method for choosing the bandwidth in practice is given, and it is proved theoretically that the proposed method does not lose estimation efficiency. Finally, numerical simulations verify the robustness and effectiveness of the proposed estimators.
Chapter 6 studies estimation for longitudinal single-index coefficient models. The estimation of the nonparametric link function in Chapter 5 involves an "undersmoothing" bandwidth, which makes bandwidth selection difficult in practice. This chapter therefore proposes centered generalized estimating equations to overcome the problem. To improve the efficiency of statistical inference, the modified Cholesky decomposition is used to estimate the covariance matrix, and more efficient centered generalized estimating equations are then constructed for the index coefficient vector. Weighted least squares is then used to obtain a more efficient estimate of the nonparametric link function. Under some regularity conditions, the large-sample properties of the proposed estimators are established. Finally, numerical simulations and a real data analysis verify the effectiveness and practicality of the proposed method.
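The modified Cholesky decomposition mentioned for Chapters 4 and 6 can be illustrated with a minimal sketch. It is not the thesis's estimation procedure: the thesis estimates the autoregressive coefficients and innovation variances through regression modeling, whereas this sketch only computes them for a covariance matrix that is already known. The function name `modified_cholesky`, the AR(1) example matrix, and the values of `rho` and `m` are assumptions chosen for illustration.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) such that T @ sigma @ T.T equals D.

    T is unit lower triangular; the negatives of its below-diagonal
    entries are the autoregressive coefficients. D is diagonal and
    holds the innovation variances.
    """
    sigma = np.asarray(sigma, dtype=float)
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T with L lower triangular
    d_sqrt = np.diag(L)             # square roots of the innovation variances
    C = L / d_sqrt                  # scale column j by 1 / L[j, j] -> unit diagonal
    T = np.linalg.inv(C)            # inverse of a unit lower triangular matrix
    D = np.diag(d_sqrt ** 2)
    return T, D

# Illustrative input (an assumption, not data from the thesis):
# AR(1) covariance for m = 4 repeated measurements with correlation rho.
rho, m = 0.5, 4
idx = np.arange(m)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])

T, D = modified_cholesky(sigma)
phi = -np.tril(T, k=-1)             # autoregressive coefficients

print(np.round(phi, 3))
print(np.round(np.diag(D), 3))
```

For this AR(1) input each measurement depends only on its immediate predecessor, so `phi` carries `rho` on the first subdiagonal and zeros elsewhere, and the innovation variances are 1 for the first measurement and `1 - rho**2` for the rest. Because the entries of `T` and the logarithms of the diagonal of `D` are unconstrained, they can be modeled by ordinary regressions, which is the property the abstract relies on.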
[Degree-granting institution]: Chongqing University
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: O212
Article ID: 1802287
Article link: http://sikaile.net/shoufeilunwen/jckxbs/1802287.html