Research on Missing-Value Imputation Methods for Compositional Data
Topic: compositional data + missing values. Source: master's thesis, Shanxi University, 2016
[Abstract]: Compositional data are a class of complex multivariate data used mainly to study the proportions of the parts that make up a whole. In recent years they have been widely applied in geology, social structure, and economic development. Survey nonresponse and errors in data collection both lead to missing data, and missing data degrade the quality of statistical data, inflate the variance of estimates in statistical analyses, and so weaken the persuasiveness of the results. Imputing missing data is therefore very important. This thesis studies missing-value imputation methods for compositional data. For compositional data with multicollinearity, it proposes an imputation method based on principal component analysis (PCA); for compositional data with both multicollinearity and outliers, it proposes a principal component imputation method based on MCD (minimum covariance determinant) estimation. The thesis has five chapters. Chapter 1 introduces the research background and significance of compositional data and the current state of research on missing data. Chapter 2 gives the definition of compositional data and the operations on the space they belong to, then presents the three commonly used log-ratio transformations and the spherical coordinate transformation, and reviews common missing-data imputation methods for ordinary data and for compositional data. Chapter 3 addresses compositional data with multicollinearity: it proposes mean imputation on the simplex and imputation based on principal component analysis, and checks the accuracy of the new methods on a real-data example and in simulation experiments. Chapter 4 builds on the methods of Chapter 3 and proposes a robust principal component imputation method based on MCD estimation for compositional data with outliers, again checking the soundness of the new method on a real-data example and in simulation experiments. Chapter 5 summarizes the work and results of the thesis and points out its limitations and open problems.
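The abstract names its building blocks (log-ratio transformations, mean imputation on the simplex, PCA-based imputation) without giving formulas, so the following is only a minimal sketch of how such a pipeline could look, not the thesis's actual algorithm. It assumes that missing parts are coded as NaN, that all observed parts are strictly positive, and that at least one row is complete. The function names (`closure`, `clr`, `impute_simplex_mean`, `impute_pca`) and the rule of shifting each row so the ratios among its observed parts are preserved are illustrative choices, not taken from the source.

```python
import numpy as np

def closure(x):
    """Rescale each row so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered log-ratio transform: log parts minus the row mean of the logs."""
    log_x = np.log(x)
    return log_x - log_x.mean(axis=-1, keepdims=True)

def _fill_from_reference(log_x, missing, log_ref):
    """Fill missing log-parts from log_ref, shifted per row to match the observed parts.

    Assumption of this sketch: the shift is the mean difference between the
    row and the reference over the observed parts, so ratios among observed
    parts are unchanged after the final closure.
    """
    observed = ~missing
    diff = np.where(observed, log_x - log_ref, 0.0)
    shift = diff.sum(axis=1, keepdims=True) / observed.sum(axis=1, keepdims=True)
    filled = np.where(missing, log_ref + shift, log_x)
    return closure(np.exp(filled))

def impute_simplex_mean(x):
    """Impute missing parts from the geometric center of the complete rows."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    complete = ~missing.any(axis=1)
    # Mean of the logs = log of the geometric center (up to a constant).
    log_center = np.log(closure(x[complete])).mean(axis=0)
    return _fill_from_reference(np.log(x), missing, log_center)

def impute_pca(x, n_components=1, n_iter=50):
    """Iteratively refine imputed parts with a rank-k PCA fit in clr coordinates."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    log_x = np.log(x)
    filled = impute_simplex_mean(x)  # starting values
    for _ in range(n_iter):
        z = clr(filled)
        mu = z.mean(axis=0)
        u, s, vt = np.linalg.svd(z - mu, full_matrices=False)
        k = n_components
        z_hat = mu + (u[:, :k] * s[:k]) @ vt[:k]  # rank-k reconstruction
        filled = _fill_from_reference(log_x, missing, z_hat)
    return filled

# Usage: a 3-part composition with one missing part in the last row.
data = np.array([
    [0.20, 0.30, 0.50],
    [0.10, 0.40, 0.50],
    [0.25, 0.25, 0.50],
    [0.20, np.nan, 0.50],
])
print(impute_pca(data).round(3))
```

A robust variant in the spirit of Chapter 4 might replace the sample mean and the SVD of the centered data with a location and covariance estimated by MCD, for example with scikit-learn's `MinCovDet`. Because the covariance of clr coordinates is singular, such a variant would likely need to work in isometric log-ratio coordinates instead; that step is not shown here.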
[Degree-granting institution]: Shanxi University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: O212.1