Sparse Matrix Imputation and Its Application in Large-Scale Questionnaire Surveys
[Abstract]: Since 2012, the term "big data" has appeared more and more often in everyday life, work, and study. According to a study by IBM, 90% of all the data ever produced by humankind was generated in the past two years, and the total volume of data is expected to reach 44 times its earlier level by 2020. Incomplete data is inevitable when large volumes of data are generated and expanded, and the missing values in incomplete data often have a significant impact on its usability. The rating systems of online shopping platforms collect large amounts of such incomplete data. Consumers rate only the items they have bought, so the ratings collected by a platform form a matrix with a large number of missing values, which we call a "sparse matrix". If a consumer buys an item but does not rate it, the matrix becomes sparser still. Considering the data structures produced by online shopping rating systems and by the film rating system of Netflix, the US online film rental company, it is easy to see that simple small-sample surveys can no longer meet current needs. A breakthrough is needed in both questionnaire length and sample size. Past practice has usually been to offer respondents a reward or feedback in exchange for their cooperation. This approach requires considerable human, material, and financial resources, and it still does not guarantee the quality of the questionnaire data. This paper uses a questionnaire-splitting method that divides a large survey questionnaire into a number of small questionnaires according to the correlations among the questions.
During the survey, a small number of these small questionnaires is randomly selected for each respondent. After the survey data are collated, the result is a sparse matrix with a large number of missing values, which is then completed by missing-value imputation. Two imputation methods are used, random imputation and multinomial logistic model imputation, and conclusions are drawn by comparing their results. Because of limits on manpower and time, the data in this paper are simulated in RStudio. First, RStudio is used to generate simulated data. Because each respondent answers in whole "units", missingness has to be created block by block, that is, unit by unit, and the number of units each respondent answers in the final sparse matrix is specified. Second, questions answered in common by different respondents are used as anchor questions, and the correlation between respondents' answers to the same questions is calculated. Finally, the imputed data are compared with the original data to verify the feasibility and accuracy of the questionnaire-splitting method and the imputation methods used in this paper. Because the data are simulated, the study rests on some idealized assumptions. The number of units each respondent answers can be controlled during the survey, but each respondent's answers within a unit must be assumed complete; that is, the data matrix has only unit-level missingness and no item-level missingness. The paper has five chapters. Chapter 1 introduces the paper, including the research background and purpose, the literature review, the research methods, and the paper's innovations. Chapter 2 introduces methods for handling missing data, outlining the methods and basic concepts that scholars have used for missing-data imputation in recent years. Chapter 3, the core of the paper, proceeds from easy to difficult, from data generation to the creation of missing values. Chapter 4 applies the material of Chapter 3 to the large sparse matrix generated by the software in order to verify the feasibility and accuracy of the paper's theory and methods. Chapter 5 summarizes the paper, discusses prospects for further research, and proposes improvements for the paper's shortcomings.
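The simulation workflow described in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the author's RStudio code: the function names, the 5-point answer scale, the independent uniformly random answers, and the sizes (200 respondents, 10 units of 5 questions, 3 units per respondent) are all assumptions made here. It shows only unit-level (block) missingness and a random imputation baseline; the model-based imputation and the anchor-question correlations mentioned in the abstract are not shown.

```python
import random


def simulate_block_missing(n_respondents, n_units, questions_per_unit,
                           units_per_respondent, n_levels=5, seed=0):
    """Return (full, sparse): complete answers and a unit-missing copy.

    Each respondent keeps only `units_per_respondent` randomly chosen
    units; every question in the other units is set to None.
    """
    rng = random.Random(seed)
    n_questions = n_units * questions_per_unit
    # Assumption: answers are independent draws on a 1..n_levels scale.
    full = [[rng.randint(1, n_levels) for _ in range(n_questions)]
            for _ in range(n_respondents)]
    sparse = []
    for row in full:
        kept_units = set(rng.sample(range(n_units), units_per_respondent))
        sparse.append([
            value if (q // questions_per_unit) in kept_units else None
            for q, value in enumerate(row)
        ])
    return full, sparse


def random_impute(sparse, seed=1):
    """Fill each missing cell with a random observed answer from its column.

    A column with no observed answers at all is left as None.
    """
    rng = random.Random(seed)
    n_questions = len(sparse[0])
    observed = [[row[q] for row in sparse if row[q] is not None]
                for q in range(n_questions)]
    return [[row[q] if row[q] is not None
             else (rng.choice(observed[q]) if observed[q] else None)
             for q in range(n_questions)]
            for row in sparse]


def imputation_accuracy(full, sparse, imputed):
    """Share of originally missing cells whose imputed value is correct."""
    missing = [(i, q) for i, row in enumerate(sparse)
               for q, value in enumerate(row) if value is None]
    if not missing:
        return float("nan")
    hits = sum(imputed[i][q] == full[i][q] for i, q in missing)
    return hits / len(missing)


full, sparse = simulate_block_missing(200, 10, 5, 3)
imputed = random_impute(sparse)
n_cells = len(sparse) * len(sparse[0])
missing_share = sum(v is None for row in sparse for v in row) / n_cells
print(f"missing share: {missing_share:.2f}")  # 7 of 10 units are dropped per respondent
print(f"imputation accuracy: {imputation_accuracy(full, sparse, imputed):.2f}")
```

Because the simulated answers here are independent of one another, random imputation can do no better than chance on average. A study like the one described would presumably generate correlated answers, so that a model-based method has structure to exploit.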
[Degree-granting institution]: Hebei University of Economics and Business
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: O151.21
Document number: 2160108
Link to this document: http://sikaile.net/kejilunwen/yysx/2160108.html