

Research and Development of a Distributed Real-Time Stream Data Computing Framework

Published: 2018-02-08 14:08

  Keywords: distributed stream computing; computing model; task scheduling; dynamic load balancing; time series prediction algorithm. Source: Zhejiang Sci-Tech University, master's thesis, 2013. Document type: degree thesis


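【Abstract】: As technology for computing over large data volumes has developed, data-processing applications have drawn wide attention, and the structure of data sources has become increasingly diverse. The data now include not only traditional non-real-time, static structured data but also many real-time, dynamically generated unstructured data streams. The input rate, input volume, and origin of these continuously arriving unstructured sequences change constantly and are hard to predict accurately. Faced with massive, shifting data streams, traditional distributed computing models struggle to extract the important information carried in the stream and to perform complex computation on it in real time. This motivates the in-depth study in this thesis of distributed real-time stream data computing as a new computing model.

Research on distributed real-time stream data computing frameworks, both in China and abroad, is still at an early stage, and no mature product exists yet. Based on a detailed analysis of the requirements of stream-processing applications, the author therefore designed and implemented a complete distributed real-time stream data computing framework, iStream, and studied and optimized load balancing, a key factor in the framework's performance. Experiments and performance tests show that the framework can be flexibly customized to actual application scenarios and offers good real-time performance and scalability. The main research content and results are as follows:

(1) Several key technologies of distributed computing frameworks are studied. Taking into account the diversity of data stream forms and of stream application scenarios, the thesis designs and implements iStream, a distributed real-time stream data computing platform that is not tied to any specific scenario and can handle many kinds of complex computation. It is highly general and extensible and significantly improves the development efficiency of third-party developers.

(2) To increase throughput, strengthen data-processing capacity, and improve the flexibility and availability of the compute-node cluster, dynamic scheduling techniques and load balancing algorithms are studied. The thesis proposes using a time series prediction algorithm to address task scheduling in parallel computing, which is an NP-complete problem, and improves the estimation algorithm of the AR (autoregressive) model so that it can handle non-stationary data series. This makes the program more efficient and the predictions more accurate, extends the approach to data sources such as stream data that cannot be represented by a simple piecewise model, and preserves the performance of the dynamic load balancing algorithm.

(3) System framework design and implementation. Building on a study of mainstream parallel programming models such as MapReduce, an improved publish-subscribe model is applied in the iStream framework. Several mainstream methods of distributed inter-process communication are analyzed and compared, and key problems of distributed systems are addressed, including highly concurrent real-time processing, the security of data communication, and adaptive adjustment. In the design and implementation of each module, traditional distributed computing strategies are adapted to the characteristics of stream computing, which improves the security of the framework and significantly reduces latency.

(4) The scenarios suited to a distributed real-time computing framework are analyzed in depth, and two case studies, a CTR-based performance advertising system and an online parameter optimization system, are used to examine the effect of iStream in commercial applications. The thesis closes with a summary and an outlook on further research.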
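The idea in contribution (2), forecasting each node's load with an autoregressive model and scheduling work accordingly, can be illustrated with a small sketch. This is not the thesis's improved AR estimation algorithm, whose details the abstract does not give. It assumes a plain AR(1) model fitted by ordinary least squares to first differences of the load series (differencing being one common way to cope with non-stationary data), and the function names `fit_ar1`, `predict_next_load`, and `pick_node` are hypothetical.

```python
from statistics import fmean


def fit_ar1(series):
    """OLS fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = fmean(prev), fmean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        # All lagged values are equal, so the slope cannot be estimated;
        # fall back to predicting the mean.
        return mean_curr, 0.0
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = cov / var_prev
    return mean_curr - phi * mean_prev, phi


def predict_next_load(history):
    """One-step load forecast from an AR(1) model on first differences."""
    if len(history) < 4:
        # Too little history to fit anything: reuse the last observation.
        return float(history[-1])
    # Assumption: differencing is used here to remove trend (non-stationarity).
    diffs = [b - a for a, b in zip(history, history[1:])]
    c, phi = fit_ar1(diffs)
    return history[-1] + c + phi * diffs[-1]


def pick_node(load_histories):
    """Return the node with the lowest forecast load, plus all forecasts."""
    forecasts = {node: predict_next_load(h) for node, h in load_histories.items()}
    return min(forecasts, key=forecasts.get), forecasts


if __name__ == "__main__":
    histories = {
        "node-a": [10, 20, 30, 40, 50, 60, 70, 80],  # load rising steadily
        "node-b": [80, 70, 60, 50, 40, 30, 20, 10],  # load falling steadily
    }
    print(pick_node(histories))
```

Both nodes in the usage example have the same average load over the window, so a scheduler that looks only at past averages could not separate them. The forecast separates them by trend, which is the motivation for prediction-based balancing.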
【Degree-granting institution】: Zhejiang Sci-Tech University
【Degree level】: Master's
【Year degree granted】: 2013
【Classification number】: TP338.8




Article ID: 1495624


Article link: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/1495624.html

