Design and Implementation of a STORM-Based Data Stream Query and Analysis System
Published: 2018-05-06 16:41
Topics: data stream analysis + data stream query; Source: Harbin Institute of Technology, 2017 master's thesis
[Abstract]: Data stream analysis has long been a research focus, and the growth of big data over the past decade has made it increasingly important and widely used. Even so, easy-to-use data stream analysis systems remain scarce, and the existing ones are hard to learn and generally require specialists. Against the background of a laboratory data stream analysis project, this thesis presents a general-purpose data stream analysis system built on Storm. To address the problems above, it analyzes the requirements of data stream development, proposes and implements scql, a SQL-like data stream analysis language, and deploys the generated logical application to Storm. The system is designed for ease of use: anyone who knows SQL can use it after briefly learning the scql syntax, and there is no large body of configuration to manage. Extensive testing indicates that the system is feasible and effective.
The system consists of three modules: a basic module, an adapter module, and a compilation module. The basic module provides the data processing classes. The adapter module deploys the logical application to Storm. The compilation module takes an scql statement through syntax analysis, semantic analysis, operator splitting and merging, and operator optimization, and finally compiles it into a logical application. Syntax analysis extracts the information in each leaf node of the abstract syntax tree; semantic analysis then reorganizes that information into table meta-information, statement analysis results, and expression descriptions. Next, operators are split according to the statement and executors are created, after which the physical execution plan is generated.
In summary, the thesis introduces the background of data streams and analyzes the related technologies, then analyzes the system requirements and proposes the overall architecture and design. Its core chapters describe the design and implementation of the system in detail and present test cases.
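The compilation stages named in the abstract (syntax analysis, semantic analysis, operator splitting) can be illustrated with a minimal sketch. The abstract does not give the scql grammar, so the statement shape below (`SELECT ... FROM ... WHERE ...`) is an assumed SQL-like subset, and every name here (`parse`, `analyze`, `split_operators`, `compile_statement`, `run`, the `sensors` stream) is hypothetical. The `run` function applies the operator chain in memory as a stand-in for deployment to Storm; it is not the thesis's adapter module.

```python
import operator
import re
from dataclasses import dataclass

# Assumed statement shape (not the real scql grammar):
#   SELECT <cols> FROM <stream> [WHERE <col> <op> <number>]
_STATEMENT = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<stream>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*(?P<op>>=|<=|=|>|<)\s*"
    r"(?P<val>-?\d+(?:\.\d+)?))?\s*;?\s*$",
    re.IGNORECASE,
)

_COMPARE = {
    ">": operator.gt, "<": operator.lt, "=": operator.eq,
    ">=": operator.ge, "<=": operator.le,
}


@dataclass
class Operator:
    kind: str   # "source", "filter", or "project"
    args: dict


def parse(statement):
    """Syntax analysis: extract the clauses from the statement text."""
    match = _STATEMENT.match(statement)
    if match is None:
        raise ValueError(f"cannot parse statement: {statement!r}")
    return match.groupdict()


def analyze(clauses, schemas):
    """Semantic analysis: check names against the table meta-information."""
    stream = clauses["stream"]
    if stream not in schemas:
        raise ValueError(f"unknown stream: {stream}")
    columns = [c.strip() for c in clauses["cols"].split(",")]
    if columns == ["*"]:
        columns = list(schemas[stream])
    predicate = None
    if clauses["col"] is not None:
        predicate = (clauses["col"], clauses["op"], float(clauses["val"]))
    referenced = columns + ([predicate[0]] if predicate else [])
    for name in referenced:
        if name not in schemas[stream]:
            raise ValueError(f"unknown column: {name}")
    return {"stream": stream, "columns": columns, "predicate": predicate}


def split_operators(analysis):
    """Operator splitting: one operator per logical step of the statement."""
    ops = [Operator("source", {"stream": analysis["stream"]})]
    if analysis["predicate"] is not None:
        ops.append(Operator("filter", {"predicate": analysis["predicate"]}))
    ops.append(Operator("project", {"columns": analysis["columns"]}))
    return ops


def compile_statement(statement, schemas):
    """Run the three stages in order and return the operator chain."""
    return split_operators(analyze(parse(statement), schemas))


def run(ops, rows):
    """In-memory stand-in for deployment: apply the chain to each row."""
    for row in rows:
        out, keep = row, True
        for op in ops:
            if op.kind == "filter":
                col, cmp, val = op.args["predicate"]
                if not _COMPARE[cmp](row[col], val):
                    keep = False
                    break
            elif op.kind == "project":
                out = {c: row[c] for c in op.args["columns"]}
        if keep:
            yield out
```

As a usage example under the same assumptions, `compile_statement("SELECT id FROM sensors WHERE temp > 30", {"sensors": ["id", "temp"]})` yields a source, a filter, and a project operator, and passing that chain to `run` with a list of row dictionaries keeps only the rows whose `temp` exceeds 30. Operator merging and optimization, which the abstract also mentions, are omitted from this sketch.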
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP311.13
Article ID: 1853066
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1853066.html