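Applied Research on Partial Least Squares and Sparse Partial Least Squares Regression
Published: 2018-01-29 21:29
Keywords: partial least squares regression model; sparse partial least squares regression model; electricity demand in Yunnan Province; cross-validation. Source: Kunming University of Science and Technology, 2015 master's thesis. Document type: degree thesis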
[Abstract]: High-dimensional, complex data now arise in every field of science, which requires statisticians to seek new statistical modeling methods. A potential difficulty in handling high-dimensional data is how to deal with multicollinearity among the predictor variables. Partial least squares (PLS) regression is a generalization of traditional multiple linear regression and is well suited to the statistical analysis of strongly correlated data. In modeling, PLS uses information synthesis and screening techniques to extract from the original variables several new components with the greatest explanatory power for the system, and then builds the model on these new composite variables; PLS can be said to combine multiple linear regression, principal component analysis, and canonical correlation analysis. Using randomly simulated data and Yunnan Province electricity data, this thesis studies and discusses the PLS model in detail in terms of its modeling principle, model solution, algorithm, algorithm simulation, parameter tuning, and data analysis, and compares multiple linear regression with PLS using criteria such as cross-validation and mean squared error. The data analysis shows that PLS is clearly superior when the predictor variables are strongly collinear.
The other focus of this thesis is sparse partial least squares (SPLS) regression. Because each new PLS component is a linear combination of all the original predictor variables, a large number of predictors hampers interpretation of the model and makes it harder to identify the most important predictors. SPLS is an improvement on PLS: it shrinks the estimated coefficients, driving the smaller ones (in absolute value) exactly to zero so that the corresponding variables are removed from the model. This thesis studies the SPLS algorithm and its implementation and, following an approach similar to that used for PLS, comprehensively compares multiple regression, PLS, and SPLS models, and identifies the most important factors affecting electricity consumption from the Yunnan Province electricity data.
The simulated-data regression results show that both PLS and SPLS regression can effectively handle collinearity among variables; of the two, SPLS regression fits better and predicts more accurately. The study of the factors influencing electricity consumption in Yunnan Province shows that electricity demand has grown steadily with the province's economic development, the growth of total retail sales of consumer goods, and the increase in fixed-asset investment. Urbanization in Yunnan Province has likewise driven society-wide demand for electricity. A rising consumer price index also pulls electricity demand upward, but the effect is small and can be ignored.
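The contrast the abstract draws between PLS and SPLS can be illustrated with a small sketch. The code below is not the thesis's algorithm, which this page does not reproduce. It is a simplified single-response PLS (PLS1) fitted by successive deflation, with an optional soft-thresholding step on each weight vector as one common way to obtain sparsity. The function names, the threshold parameter `eta`, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def soft_threshold(w, eta):
    """Shrink weights toward zero by eta * max|w|; small ones become exactly 0."""
    cut = eta * max(abs(v) for v in w)
    return [math.copysign(abs(v) - cut, v) if abs(v) > cut else 0.0 for v in w]


def fit_pls1(X, y, n_components, eta=0.0):
    """Simplified PLS1 by deflation; eta > 0 sparsifies each weight vector."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xr = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X residual
    yr = [v - y_mean for v in y]                                # centered y residual
    comps = []
    for _ in range(n_components):
        # Weight vector: direction of covariance between X and y, optionally sparse.
        w = soft_threshold(
            [sum(Xr[i][j] * yr[i] for i in range(n)) for j in range(p)], eta)
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break
        w = [v / norm for v in w]
        t = [sum(Xr[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(Xr[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yr[i] * t[i] for i in range(n)) / tt  # regression of y on the score
        for i in range(n):  # deflate: remove what this component explains
            for j in range(p):
                Xr[i][j] -= t[i] * load[j]
            yr[i] -= q * t[i]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def predict(model, x):
    """Predict one observation by replaying the deflation steps on it."""
    x_mean, y_mean, comps = model
    r = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(r[j] * w[j] for j in range(len(r)))
        y_hat += q * t
        r = [r[j] - t * load[j] for j in range(len(r))]
    return y_hat


def selected_variables(model):
    """Indices of predictors with a nonzero weight in at least one component."""
    return sorted({j for w, _, _ in model[2] for j, v in enumerate(w) if v != 0.0})


def make_data(n=100, seed=0):
    """Hypothetical data: 3 strongly collinear predictors plus 3 irrelevant ones."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        row = [z + 0.1 * rng.gauss(0, 1) for _ in range(3)]
        row += [rng.gauss(0, 1) for _ in range(3)]
        X.append(row)
        y.append(sum(row[:3]) + 0.5 * rng.gauss(0, 1))
    return X, y


X, y = make_data()
dense = fit_pls1(X, y, n_components=1)
sparse = fit_pls1(X, y, n_components=1, eta=0.5)
print("variables used by PLS: ", selected_variables(dense))
print("variables used by SPLS:", selected_variables(sparse))
```

In this sketch the plain PLS component puts a nonzero weight on every predictor, while the thresholded version drops predictors whose covariance with the response is small. In practice the number of components and a threshold such as `eta` would be tuned, for example by the cross-validation the abstract mentions.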
Degree-granting institution: Kunming University of Science and Technology
Degree level: Master's
Year degree conferred: 2015
Classification number: O212.1
Article ID: 1474400
Article link: http://sikaile.net/kejilunwen/yysx/1474400.html