Keywords for this article: lag correlation mining method for big data streams. Compiled and published by Bigeng Culture Communication (笔耕文化传播).
LAG CORRELATION MINING METHOD FOR BIG DATA STREAMS BASED ON A SERIES-LAYERED SLIDING WINDOW
Ren Yonggong, Qian Haizhen, Lang Hongyu (School of Computer and Information Technology, Liaoning Normal University, Dalian 116029, Liaoning, China)
Abstract: To address the problem that lag correlations between sequences cannot be found quickly when mining big data streams, this paper proposes a lag correlation mining method for big data streams based on a series-layered sliding window. The method first stratifies each sequence into layers of increasing series (level) number and calculates the coverage capability g of the sliding window on each layer. It then computes the sequence parameter values over the sliding windows of each layer. Finally, from the per-layer parameter values, it computes the lag correlation coefficient of the sequences and uses it to determine their lag correlation. Using the Nyquist sampling theorem, the paper proves that computing only log2(n) points for the n sequences of a big data stream is enough to determine the lag correlation with high precision. This greatly reduces computation time, and the more sequences there are, the smaller the computational error and the higher the efficiency. Experimental results show that the method substantially reduces running time and improves efficiency while preserving accuracy, and that it performs particularly well on big data stream sequences, with broad application prospects.
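The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea of layered windows probed at geometrically spaced lags. It is not the authors' implementation. The function names (`pearson`, `layer_means`, `estimate_lag`), the choice of window means as the per-layer "parameter value", the window length 2^h at layer h, and the `min_pairs` cutoff are all assumptions made for illustration; the coverage capability g and the Nyquist-based error bound are not reproduced here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def layer_means(seq, level):
    """Assumed per-layer summary: means over non-overlapping windows of length 2**level."""
    width = 2 ** level
    return [sum(seq[i:i + width]) / width
            for i in range(0, len(seq) - width + 1, width)]


def estimate_lag(x, y, min_pairs=8):
    """Probe lags 0, 1, 2, 4, ... and return (lag, score) with the highest correlation.

    Lag 2**h is probed on layer h by shifting y's window means by one window,
    so only about log2(len(x)) lags are evaluated. Only positive correlation
    is searched for, and y is assumed to lag behind x.
    """
    best_lag, best_score = 0, pearson(x, y)  # lag 0 on the raw sequences
    level = 0
    while True:
        xs = layer_means(x, level)
        ys = layer_means(y, level)
        if len(xs) - 1 < min_pairs:  # too few window pairs for a meaningful score
            break
        score = pearson(xs[:-1], ys[1:])  # compare x window k with y window k + 1
        if score > best_score:
            best_lag, best_score = 2 ** level, score
        level += 1
    return best_lag, best_score


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.random() for _ in range(256)]
    y = [0.0] * 8 + x[:-8]  # y repeats x eight steps later
    print(estimate_lag(x, y))
```

Because each layer halves the number of windows, the work per layer shrinks geometrically, which is where the saving over testing every possible lag comes from. The trade-off in this sketch is resolution: a true lag that is not a power of two is only matched approximately by the nearest probed lag.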
Keywords: Nyquist sampling theorem; Big data stream; Sliding window; Lag correlation
Funding: Natural Science Foundation of Liaoning Province (201202119); Liaoning Provincial Science Plan Project (2013405003); Dalian Science and Technology Plan Project (2013A16GX116).
Article ID: 180366
Article link: http://sikaile.net/shoufeilunwen/xixikjs/180366.html