Statistical Inference for Varying-Coefficient Partially Linear Models

Published: 2018-02-25 22:32

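Keywords: varying-coefficient partially linear model; efficient estimation; model identification; variable selection; ultra-high dimension
Source: Nanjing University of Information Science and Technology, master's thesis, 2015
Document type: degree thesis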

[Abstract]: As Internet databases keep expanding, much of the data collected in practical problems is high-dimensional. To handle high-dimensional data analysis, many parametric and semiparametric models have been proposed to avoid the "curse of dimensionality". Among these, the varying-coefficient partially linear model has attracted wide attention because it contains both constant coefficients and functional coefficients. Commonly used estimation methods include least squares estimation (LSE) and minimum average variance estimation (MAVE), but the estimators they produce may not be efficient, which calls for more suitable estimation methods. In addition, as data accumulate, the number of covariates often grows at a polynomial rate and sometimes even at an exponential rate. How to estimate and carry out statistical inference for parametric or semiparametric models with high-dimensional data therefore becomes even more important, and studying the varying-coefficient partially linear model in high-dimensional and ultra-high-dimensional settings requires more suitable methods.

This thesis systematically studies estimation, variable selection, and dimension reduction in ultra-high-dimensional data for the varying-coefficient partially linear model. The results show that an efficient estimator can be obtained by constructing an efficient estimating equation; that variable selection with the group lasso can identify which variables have constant coefficients and which have functional coefficients; and that feature screening by ranking KL distances can reduce the dimension of ultra-high-dimensional data. The thesis addresses several statistical problems for the model under different dimensional settings. The main contents are as follows:

(1) Efficient estimation for the varying-coefficient partially linear model with heteroscedasticity is studied. The efficient score function and the efficient estimator of the parameter of interest are given for complete samples. An efficient estimating equation is proposed and the semiparametric efficiency bound of the heteroscedastic model is given. The resulting estimator is proved to be efficient, its large-sample properties are proved, and its finite-sample properties are studied by numerical simulation.

(2) Variable selection for the varying-coefficient partially linear model with high-dimensional data is studied. A two-stage variable selection method is proposed that selects variables separately in the linear part and in the varying-coefficient part of the model, yielding Adaptive Lasso estimators of the parameters. The asymptotic properties and consistency of the estimators are proved, and their finite-sample properties are studied by numerical simulation.

(3) Variable screening for the varying-coefficient model with ultra-high-dimensional data is studied. A screening method based on KL distance is proposed: a marginal KL distance statistic between each covariate and the response is constructed from conditional cumulative distribution functions, and the covariates are ranked by this statistic to screen variables. Numerical simulation is used to verify the finite-sample properties of the proposed method.
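The ranking idea in item (3) can be illustrated with a small sketch. It is not the statistic proposed in the thesis, which is built from conditional cumulative distribution functions and is not spelled out in the abstract. As a simplifying assumption, the sketch bins the response and each covariate by rank, measures the KL divergence between the distribution of the binned response within each covariate slice and its overall distribution, and averages over slices (which equals the mutual information of the binned pair). The function names `quantile_bins`, `marginal_kl_utility`, `screen`, and `simulate`, the bin counts, and the simulated linear model are all hypothetical choices made for illustration.

```python
import math
import random


def quantile_bins(values, n_bins):
    """Assign each value to one of n_bins roughly equal-sized, rank-based bins."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // n
    return bins


def marginal_kl_utility(x, y, n_slices=4, n_bins=4):
    """Slice-weighted average of KL( P(Y-bin | X-slice) || P(Y-bin) )."""
    n = len(y)
    x_slice = quantile_bins(x, n_slices)
    y_bin = quantile_bins(y, n_bins)
    marginal = [y_bin.count(b) / n for b in range(n_bins)]
    utility = 0.0
    for s in range(n_slices):
        members = [y_bin[i] for i in range(n) if x_slice[i] == s]
        if not members:
            continue
        kl = 0.0
        for b in range(n_bins):
            p = members.count(b) / len(members)
            if p > 0:  # terms with p == 0 contribute nothing to the KL sum
                kl += p * math.log(p / marginal[b])
        utility += (len(members) / n) * kl
    return utility


def screen(columns, y, keep):
    """Rank covariates by marginal utility and return the top `keep` indices."""
    scores = [marginal_kl_utility(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep], scores


def simulate(n=500, p=20, seed=0):
    """Toy data (an assumption): only covariates 0 and 5 affect the response."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [3 * columns[0][i] + 3 * columns[5][i] + rng.gauss(0, 1) for i in range(n)]
    return columns, y


columns, y = simulate()
selected, scores = screen(columns, y, keep=2)
print(sorted(selected))
```

Because each covariate is scored on its own, the cost grows linearly with the number of covariates, which is what makes marginal ranking attractive as a first step before a penalized method such as the lasso is applied to the retained variables.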
[Degree-granting institution]: Nanjing University of Information Science and Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: O212.1

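Zhao Wenxing (趙文星); Statistical Inference for Varying-Coefficient Partially Linear Models [D]; Nanjing University of Information Science and Technology; 2015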

Article ID: 1535438

Article link: http://sikaile.net/kejilunwen/yysx/1535438.html
