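Research on Core Technologies for Data Analysis and Visualization in a Big Data Environment
Topics: data analysis + data visualization; Source: Beijing University of Posts and Telecommunications, master's thesis, 2016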
[Abstract]: Since the advent of the information society, the explosive growth of data has brought new opportunities and challenges to information processing and data analysis. IBM Watson points out that roughly 2.5 billion GB of unstructured raw data is generated every day, and the key information in that data is often buried under its sheer volume and complex structure. The data therefore needs to be filtered effectively, and visual data analysis is a crucial part of that process. This thesis surveys the main methods and tools in the field of visual data analysis and finds that current approaches still leave several problems open: the way people perceive data imposes cognitive limitations on the analysis process, while analysis carried out by machine computation alone tends to lack a clear purpose; choosing a suitable visualization model during analysis is difficult in practice; and some visualization tools pursue flashy effects at the expense of conveying the key information in the data. To address these problems, the thesis completes the following research and practical work:
1. Starting from the workflow of visual data analysis and drawing on traditional exploratory data discovery methods, it proposes a heuristic strategy for visual data analysis. The strategy takes account of how people perceive data, emphasizes interaction and feedback between human and machine, and lets the data itself guide the visual analysis process.
2. It describes in detail the design and implementation of a visual data analysis platform. Built on a B/S (browser/server) architecture and the popular D3.js visualization framework, the platform is a visual analysis application that integrates data attribute analysis, visualization model recommendation, and dynamic construction of visualization models.
3. The platform puts the proposed heuristic analysis strategy into practice and introduces two new visual analysis methods: automated recommendation of visual analysis charts, and recommendation of visualization model encodings.
The system simplifies the use of visual data analysis methods for analysts, improves analysis efficiency, and yields better visual analysis results. Its architecture emphasizes low coupling between modules and offers good extensibility. Building on this architecture, the heuristic approach to visual data analysis can be practiced further and the system extended to more concrete applications. The system can also be improved further by extending the range of visualization models and refining the model recommendation algorithm.
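The abstract does not give the recommendation rules themselves, so the following is only a minimal sketch of how data attribute analysis could drive a chart recommendation. The function names `infer_attribute_type` and `recommend_charts`, the three attribute categories, and the rule table are all assumptions made for illustration; they are not the thesis's actual method.

```python
from datetime import date
from typing import Any, Dict, List


def infer_attribute_type(values: List[Any]) -> str:
    """Classify a column as 'temporal', 'quantitative', or 'categorical'.

    Hypothetical attribute analysis: only Python value types are inspected.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "categorical"
    # datetime is a subclass of date, so this covers both.
    if all(isinstance(v, date) for v in non_null):
        return "temporal"
    # bool is a subclass of int, so exclude it explicitly.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in non_null):
        return "quantitative"
    return "categorical"


def recommend_charts(columns: Dict[str, List[Any]]) -> List[str]:
    """Return candidate chart types for a table given as column -> values.

    The rules below are illustrative assumptions, not taken from the thesis.
    """
    types = [infer_attribute_type(vals) for vals in columns.values()]
    n_quant = types.count("quantitative")
    n_cat = types.count("categorical")
    n_time = types.count("temporal")

    candidates: List[str] = []
    if n_time >= 1 and n_quant >= 1:
        candidates.append("line chart")
    if n_cat >= 1 and n_quant >= 1:
        candidates.append("bar chart")
    if n_quant == 2:
        candidates.append("scatter plot")
    if n_quant >= 3:
        candidates.append("parallel coordinates")
    if n_quant == 1 and n_cat == 0 and n_time == 0:
        candidates.append("histogram")
    # Fall back to a plain table when no rule matches.
    return candidates or ["table"]


if __name__ == "__main__":
    sales = {"region": ["north", "south"], "revenue": [120.5, 98.0]}
    print(recommend_charts(sales))
```

In a platform like the one described, the analyst would pick from the returned candidates and the choice could feed back into later recommendations, which is where the human-machine interaction emphasized by the heuristic strategy would enter.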
[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP311.13