Research on Sparse Matrix Imputation and Its Application in Large-Scale Questionnaire Surveys
[Abstract]: Since 2012, the term "big data" has appeared more and more often in daily life, work, and study. According to a study by IBM, 90% of all the data ever produced by humankind were generated in the past two years, and after 2020 the total volume of data is expected to reach 44 times its previous level. Incomplete data are inevitable as large volumes of data are generated and expanded, and the missing values in incomplete data often have a significant impact on data usability. The rating systems of online shopping platforms are a major source of such incomplete data. Consumers rate the goods they have bought, and the platform's scoring system collects all the ratings into a matrix with a large number of missing values, which we call a "sparse matrix". If a consumer buys an item but does not rate it, the matrix becomes even sparser. Judging from the data structures produced by the scoring systems of online shopping platforms and by the film rating system of Netflix, the U.S. online film rental company, it is easy to see that simple small-sample surveys can no longer meet current needs. A breakthrough is therefore needed in both questionnaire size and sample size. In the past, respondents were usually given a reward or feedback to secure their cooperation. That approach requires considerable human, material, and financial resources, and it still does not guarantee the quality of the questionnaire data. This paper instead uses a questionnaire segmentation method: the large questionnaire is divided into a number of small questionnaires according to the correlations among its questions.
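The "sparse matrix" described above can be illustrated with a small sketch. The ratings below are invented for illustration only (they are not data from the thesis), and the helper name `sparsity` is hypothetical; the thesis reports using RStudio, whereas this sketch uses plain Python.

```python
# Hypothetical 4-consumer x 5-product rating matrix (illustrative values only).
# None marks a product the consumer did not rate.
ratings = [
    [5, None, None, 3, None],
    [None, 4, None, None, None],
    [2, None, None, None, 1],
    [None, None, 4, None, None],
]


def sparsity(matrix):
    """Return the fraction of entries that are missing (None)."""
    total = sum(len(row) for row in matrix)
    missing = sum(value is None for row in matrix for value in row)
    return missing / total


# 14 of the 20 cells are missing in this toy matrix.
print(f"share of missing entries: {sparsity(ratings):.2f}")
```

Each purchase that goes unrated stays a `None` cell, so the share of missing entries grows with the number of unrated purchases.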
During the survey, each respondent is randomly assigned a small number of the small questionnaires. After the survey data are collated, the result is a sparse matrix with a large number of missing values. Missing-value imputation is then applied to the sparse matrix to obtain complete data. Two imputation methods are used, random-number imputation and multinomial logistic model imputation, and conclusions are drawn by comparing their results. Because of limits on manpower and time, the data in this paper come from simulation in RStudio. First, RStudio is used to generate simulated data. Because each respondent answers in whole "units", missingness has to be introduced block by block, that is, whole units are missing, so that in the final sparse matrix each respondent has answered the questions of a specified number of units. Second, questions answered in common by different respondents are used as anchor questions, and the correlation between respondents in answering the same questions is calculated. Finally, the imputed data are compared with the original data to verify the feasibility and accuracy of the questionnaire segmentation method and of the imputation methods used in this paper. Because the data are simulated in RStudio, the study rests on certain idealized assumptions. The number of units each respondent answers can be controlled during the survey, but each respondent's answers within a unit must be assumed to be complete, that is, the data matrix contains only unit-level missingness and no missing individual items. The paper has five chapters. Chapter 1 introduces the basic content of the paper.
It covers the research background and purpose, the literature review, the research methods, and the innovations of the paper. Chapter 2 introduces methods for handling missing data and outlines the methods and basic concepts that scholars have used for missing-data imputation in recent years. Chapter 3, the core of the paper, proceeds from easy to difficult, from data generation to missing data. Chapter 4 uses the content of Chapter 3 and the large sparse matrix generated by software in Chapter 3 to verify the feasibility and accuracy of the theory and methods of this paper. Chapter 5 summarizes the paper, discusses prospects for further research on the topic, and proposes improvements for the paper's shortcomings.
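The simulation procedure outlined in the abstract (generate complete data, delete whole units per respondent, impute with random numbers, compare with the original) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the thesis used RStudio, while this sketch uses Python; answers are assumed to be independent uniform draws on a 1 to 5 scale; and all function names and parameter values (200 respondents, 6 units of 5 items, 2 units answered) are hypothetical. The multinomial logistic imputation and the anchor-question correlations are not shown.

```python
import random


def simulate_complete(n_resp, n_units, items_per_unit, levels, rng):
    """Complete answers: data[r][u] lists respondent r's answers to unit u.
    Assumption: answers are independent uniform draws from 1..levels."""
    return [[[rng.randint(1, levels) for _ in range(items_per_unit)]
             for _ in range(n_units)] for _ in range(n_resp)]


def make_block_missing(data, units_answered, rng):
    """Keep a random subset of whole units per respondent; blank the rest."""
    sparse = []
    for row in data:
        kept = set(rng.sample(range(len(row)), units_answered))
        sparse.append([list(unit) if u in kept else [None] * len(unit)
                       for u, unit in enumerate(row)])
    return sparse


def random_impute(sparse, levels, rng):
    """Random-number imputation: fill each missing answer with a uniform draw."""
    return [[[rng.randint(1, levels) if v is None else v for v in unit]
             for unit in row] for row in sparse]


def agreement(original, imputed, sparse):
    """Share of originally missing cells whose imputed value equals the truth."""
    hits = total = 0
    for r, row in enumerate(sparse):
        for u, unit in enumerate(row):
            for i, v in enumerate(unit):
                if v is None:
                    total += 1
                    hits += original[r][u][i] == imputed[r][u][i]
    return hits / total if total else float("nan")


rng = random.Random(0)  # fixed seed so the run is repeatable
full = simulate_complete(n_resp=200, n_units=6, items_per_unit=5, levels=5, rng=rng)
sparse = make_block_missing(full, units_answered=2, rng=rng)
filled = random_impute(sparse, levels=5, rng=rng)
print(f"agreement on imputed cells: {agreement(full, filled, sparse):.3f}")
```

Because both the simulated answers and the imputed values here are independent uniform draws over five levels, the agreement rate should hover near one in five. A model-based method would only do better when the simulated answers are correlated across questions, which this sketch deliberately does not model.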
[Degree-granting institution]: Hebei University of Economics and Business
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: O151.21
Article ID: 2160108
Article link: http://sikaile.net/kejilunwen/yysx/2160108.html