Research on Rainfall Prediction Using an Online Sequential Extreme Learning Machine Based on Storm
[Abstract]: Rainfall is an important parameter in disaster prevention and mitigation and largely reflects the trend of disaster occurrence. It has a significant impact on agricultural production, soil and water loss, and engineering applications. Accurately predicting rainfall in a region helps agricultural and water conservancy departments improve their ability to guard against drought and waterlogging and keep the damage to a minimum. With floods and waterlogging occurring frequently in China in recent years, using meteorological data to forecast rainfall accurately and promptly has become increasingly important. The arrival of the big data era has also brought new challenges to the meteorological forecasting industry. Meteorological data come mainly from ground observation, meteorological satellite remote sensing, weather radar, and numerical forecast products. These four types account for more than 90% of the total data volume and are used directly in meteorological operations, weather forecasting, climate prediction, and meteorological services. Stream data is a set of digitally encoded, continuous signals. In general, a data stream can be regarded as unbounded, and stream data is widely used in online public opinion analysis, stock market trend analysis, satellite positioning, real-time financial monitoring, Internet of Things monitoring, and real-time meteorological monitoring. There is still considerable room for development in rainfall prediction based on large-scale meteorological stream data. Traditional rainfall prediction usually applies machine learning to offline meteorological data in batch mode: once all training samples have been learned in a single pass, learning stops. In practice, however, the full set of training samples cannot be obtained at once; samples usually arrive sequentially over time.
Although a large-scale cluster can, to some extent, relieve the shortage of computing power caused by large data volumes, it cannot process newly arrived data quickly or update the learned knowledge in time. To address real-time computation and massive-scale processing of meteorological data, this thesis presents a rainfall prediction model based on an online sequential extreme learning machine (OS-ELM) running on the Storm platform. The main contents and contributions are as follows. (1) Because offline batch forecasting on meteorological data cannot reflect changes in rainfall in time, a rainfall prediction model based on the online sequential extreme learning machine is proposed. To suit the large-scale, real-time nature of meteorological data, the extreme learning machine algorithm is adapted for online learning. The model initializes several online extreme learning machine models; as new batches of data keep arriving, it continues learning from the new samples on top of the existing training results. Stochastic gradient descent and error-weight adjustment are introduced to feed the error of each new prediction back into the model and update the error-weight parameters in real time, improving prediction accuracy. (2) Because meteorological data are massive and high-dimensional, the data preprocessing stage analyzes the correlation coefficients between the decision attributes and uses them to filter the prediction attributes, which reduces the complexity of the meteorological data and improves the efficiency of model training. In addition, the Storm stream processing framework and the Kafka distributed message queue are used to train on large-scale meteorological data in parallel.
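The correlation-based attribute filtering mentioned in contribution (2) might look roughly like the following sketch. It assumes the Pearson correlation between each candidate attribute and the rainfall target is the criterion; the function name `select_by_correlation`, the attribute names, and the threshold value are hypothetical and do not come from the thesis.

```python
import numpy as np


def select_by_correlation(X, y, names, threshold=0.3):
    """Keep attributes whose absolute Pearson correlation with the
    target y is at least `threshold` (an assumed cut-off)."""
    kept = []
    for j, name in enumerate(names):
        col = X[:, j]
        if np.std(col) == 0.0:
            # A constant column has no defined correlation; drop it.
            continue
        r = np.corrcoef(col, y)[0, 1]
        if abs(r) >= threshold:
            kept.append((name, float(r)))
    return kept


# Synthetic illustration only: one informative column, one noise column,
# and one constant column (hypothetical attribute names).
rng = np.random.default_rng(0)
rain = rng.normal(size=200)
features = np.column_stack([
    rain + 0.1 * rng.normal(size=200),  # strongly related to the target
    rng.normal(size=200),               # unrelated noise
    np.ones(200),                       # constant, carries no information
])
selected = select_by_correlation(
    features, rain, ["humidity", "noise", "station_flag"], threshold=0.5
)
print([name for name, _ in selected])
```

Filtering before training keeps the hidden-layer input small, which is one plausible way the preprocessing step could reduce training cost.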
Experimental results show that the algorithm, running on the Storm platform, achieves excellent parallel performance and prediction accuracy.
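The idea of continuing to learn from newly arrived batches on top of existing training results can be illustrated with the standard OS-ELM recursive least-squares update. The sketch below is a minimal single-model version under stated assumptions: the class name `OSELM`, the sigmoid activation, the small ridge term added for numerical stability, and the synthetic data are all illustrative choices. It does not include the stochastic gradient descent, error-weight adjustment, multi-model setup, or Storm/Kafka integration described in the abstract.

```python
import numpy as np


class OSELM:
    """Single-hidden-layer online sequential ELM for regression."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Input weights and biases are random and fixed; only beta is learned.
        self.W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
        self.b = rng.uniform(-1.0, 1.0, size=n_hidden)
        self.P = None     # running inverse of (H^T H + ridge * I)
        self.beta = None  # output weights

    def _hidden(self, X):
        # Sigmoid hidden-layer output matrix H.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X0, y0, ridge=1e-3):
        # Initial batch: solve the (ridge-regularized) least-squares problem.
        H = self._hidden(X0)
        self.P = np.linalg.inv(H.T @ H + ridge * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ y0

    def partial_fit(self, X, y):
        # Sequential phase: update P and beta from the new chunk only,
        # without revisiting earlier samples.
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (y - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic stream standing in for meteorological records (assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=300)

model = OSELM(n_inputs=4, n_hidden=20)
model.fit_initial(X[:100], y[:100])
for start in range(100, 300, 50):  # chunks arriving over time
    model.partial_fit(X[start:start + 50], y[start:start + 50])

rmse = float(np.sqrt(np.mean((model.predict(X) - y) ** 2)))
print(f"training RMSE after sequential updates: {rmse:.4f}")
```

Because each update touches only the current chunk and the fixed-size matrix `P`, the cost per batch does not grow with the amount of data already seen, which is what makes this family of models a natural fit for stream processing.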
[Degree-granting institution]: Xiangtan University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: P457.6; TP181