Design and Implementation of a Network Traffic Storage and Analysis Platform
Topic: network traffic + NetFlow. Source: Shandong University, master's thesis, 2017
【Abstract】: Data in every sector of society is growing explosively, and advances in computer science and technology have greatly changed how data is transmitted, stored, and processed. With the rapid development of network information technology, individuals, programs, and applications on the Internet all generate large volumes of data, and this growth poses new challenges for data storage and high-speed access. Network applications therefore need to be able to expand their storage capacity: storage nodes should be able to join dynamically, and both historical data and newly generated data should be distributed evenly across the storage nodes so that the load on the whole platform stays balanced. With current network data storage technology, dynamic expansion at the application level is costly, whereas distributed storage offers low-cost storage and efficient retrieval of massive data.

This thesis presents the development and implementation of a network traffic storage and analysis platform. Starting from the business requirements and drawing on existing network traffic analysis techniques, it carries out functional and non-functional requirements analysis, and on that basis designs and implements the functional architecture and the technical architecture of the system. The collection and analysis subsystem is written in C++ as a modification of nprobe, and the gRPC framework carries the data communication between the collection and analysis subsystem and the metadata management subsystem. The metadata management system maintains and updates information about the storage nodes and assigns node information to storage requests and query requests. The query service is also implemented in C++: it first sends a query request to the metadata management system, then, once it has the storage node information, queries the individual storage nodes in multiple threads and returns the result data to the client, which aggregates the results and presents them to the user or passes them to third-party applications. FastBit was chosen as the platform's database on the basis of the business requirements, and the database partition structure and storage paths were designed around the business logic.

Based on the actual operating environment of current networks, the thesis designs a complete scheme for traffic collection, analysis, and storage, and implements the platform on top of it. From the platform's analysis results, an operator can see the overall state of the network at a given moment; see how much of the network's resources each application occupies, so that applications short of resources can be given more bandwidth; see how each user is using the network, which helps administrators allocate resources more effectively; and, using NetFlow data, analyze characteristics of the running network such as traffic direction in order to detect potential traffic anomalies. Experimental evaluation verifies the platform's collection, storage, and retrieval capability under massive traffic, the improvement in query performance over massive historical data, and the stability and scalability of the running system.
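The store and query flow described in the abstract (a metadata manager hands out storage nodes, the query service fans out to every node in parallel, and the client merges the partial results) can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the thesis implementation is in C++ with gRPC and FastBit. Every name below (`StorageNode`, `MetadataManager`, `query_bytes_per_app`) is hypothetical, the nodes are in-memory stand-ins, and the "fewest records first" placement rule is an assumption made for the example, not the policy stated in the thesis.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class StorageNode:
    """In-memory stand-in for one storage node (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.flows = []  # each flow record: (src_ip, app, byte_count)

    def store(self, flow):
        self.flows.append(flow)

    def bytes_per_app(self):
        # Partial aggregate computed locally on this node.
        totals = Counter()
        for _src_ip, app, byte_count in self.flows:
            totals[app] += byte_count
        return totals


class MetadataManager:
    """Tracks storage nodes and assigns them to store and query requests."""

    def __init__(self):
        self.nodes = []

    def add_node(self, node):
        # A node can join at any time.
        self.nodes.append(node)

    def node_for_store(self):
        # Assumed placement rule: the node currently holding the fewest records.
        return min(self.nodes, key=lambda n: len(n.flows))

    def nodes_for_query(self):
        return list(self.nodes)


def query_bytes_per_app(manager):
    """Get the node list, query every node in its own thread, merge the results."""
    nodes = manager.nodes_for_query()
    merged = Counter()
    if not nodes:
        return merged
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        for partial in pool.map(StorageNode.bytes_per_app, nodes):
            merged.update(partial)  # client-side aggregation
    return merged


if __name__ == "__main__":
    manager = MetadataManager()
    manager.add_node(StorageNode("node-a"))
    manager.add_node(StorageNode("node-b"))
    sample_flows = [
        ("10.0.0.1", "http", 500),
        ("10.0.0.2", "dns", 60),
        ("10.0.0.1", "http", 1500),
        ("10.0.0.3", "ssh", 200),
    ]
    for flow in sample_flows:
        manager.node_for_store().store(flow)
    print(dict(query_bytes_per_app(manager)))
```

In this sketch each node returns only its own per-application byte totals, so the client merges small summaries rather than raw flow records. Whether the thesis platform aggregates on the nodes or ships raw rows to the client is not stated in the abstract.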
【Degree-granting institution】: Shandong University
【Degree level】: Master's
【Year degree conferred】: 2017
【Classification number】: TP311.52
Article ID: 1919300
Article link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/1919300.html