Research on Several Key Technologies in Spatio-temporal Sequence Data Mining
Source: Central South University, master's thesis, 2013. Document type: degree thesis
Keywords: data mining; spatio-temporal sequence; cluster analysis; association analysis; grey model
[Abstract]: Spatio-temporal sequence data mining is an important branch of spatio-temporal data mining that deals specifically with data of the spatio-temporal sequence type. Such data not only describe the spatial characteristics of geographic objects or phenomena but also record how those objects or phenomena evolve over time, so studying them is of considerable importance. This thesis reviews related research in China and abroad and, building on the existing theoretical frameworks of spatial data mining and time series data mining, proposes mining spatio-temporal sequence data, discusses the main contents and technical means of such mining, and proposes solutions to specific problems in its techniques. The main work is as follows:

(1) Spatio-temporal sequence clustering. To meet the clustering requirement of "similar time series, spatially adjacent", a seed-point diffusion clustering algorithm is proposed. The object whose time series is most similar to those of its spatial neighbors is selected as a seed; the seed is labeled and the label is diffused to its spatial neighbors. The next seed is then selected, labeled, and diffused, and this repeats until every entity to which a spatio-temporal sequence is attached has been labeled. The method is computationally simple and efficient and requires no parameter setting, which avoids subjectivity in parameter selection.

(2) Spatio-temporal sequence association rules. For the association condition "consequent known, antecedent unknown", a constrained event association rule algorithm is proposed. With the consequent target event known, an effective time window accounts for the time lags among antecedent events and between antecedent events and the consequent target event. When antecedent event sets are computed, only the candidate antecedent event sets inside the effective time window of the consequent target event are considered. Frequent event sets therefore do not have to be searched for over the entire event sequence, which reduces the complexity of the algorithm.

(3) Spatio-temporal sequence prediction modeling. GM(1,1) can model and forecast "small-sample, poor-information" data but does not take spatial autocorrelation into account. The STGM(1,1) modeling method is therefore proposed, which combines spatial autocorrelation with the grey theory prediction model. Because spatial autocorrelation describes the spatial dependence of spatial objects or phenomena, STGM(1,1) can handle small-sample spatio-temporal sequence data.

Finally, the research results of the thesis are summarized and directions for further work on spatio-temporal sequence data mining are outlined.
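The seed-point diffusion clustering described in the abstract can be sketched roughly as follows. This is a minimal illustration rather than the thesis's implementation: the abstract does not specify the similarity measure or how a seed is scored, so Pearson correlation, the mean-over-neighbors score, and the names `pearson` and `seed_diffusion_clusters` are all assumptions made here.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length series (assumed similarity measure)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / (sxx * syy) ** 0.5


def seed_diffusion_clusters(series, neighbors):
    """Label objects by repeated seed selection and diffusion to spatial neighbors.

    series:    dict mapping object id -> list of observations
    neighbors: dict mapping object id -> list of spatially adjacent object ids
    """
    # Assumption: an object's seed score is its mean similarity to all neighbors.
    score = {}
    for obj, ts in series.items():
        nbrs = neighbors.get(obj, [])
        if nbrs:
            score[obj] = mean(pearson(ts, series[n]) for n in nbrs)
        else:
            score[obj] = float("-inf")  # isolated objects become seeds last

    labels = {}
    unlabeled = set(series)
    next_label = 0
    while unlabeled:
        # Pick the unlabeled object most similar to its neighbors (ties: smallest id).
        seed = max(sorted(unlabeled), key=lambda o: score[o])
        labels[seed] = next_label
        unlabeled.discard(seed)
        # Diffuse the seed's label to its still-unlabeled spatial neighbors.
        for n in neighbors.get(seed, []):
            if n in unlabeled:
                labels[n] = next_label
                unlabeled.discard(n)
        next_label += 1
    return labels


# Four sites on a line: A and B rise over time, C and D fall.
series = {
    "A": [1, 2, 3, 4],
    "B": [2, 3, 4, 6],
    "C": [9, 7, 4, 2],
    "D": [8, 6, 5, 1],
}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
labels = seed_diffusion_clusters(series, neighbors)
print(labels)
```

In this toy example the rising sites A and B end up in one cluster and the falling sites C and D in another, with no threshold or cluster count supplied, which is consistent with the parameter-free property the abstract claims.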
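The time-window idea behind the constrained event association rule algorithm can be illustrated with a small sketch. The abstract gives no rule-quality measure or data layout, so everything concrete here is an assumption: events are `(time, type)` pairs, support is the fraction of target occurrences whose window contains the antecedent set, and `candidate_antecedents`, `window`, `min_support`, and `max_size` are hypothetical names.

```python
from collections import Counter
from itertools import combinations


def candidate_antecedents(events, target, window, min_support=0.5, max_size=2):
    """Collect antecedent event sets seen in the window before each target event.

    events: iterable of (time, event_type) pairs
    target: the known consequent event type
    window: length of the effective time window preceding each target event
    Returns a dict mapping antecedent tuple -> support over target occurrences.
    """
    events = sorted(events)
    target_times = [t for t, e in events if e == target]
    if not target_times:
        return {}

    counts = Counter()
    for tt in target_times:
        # Only events inside [tt - window, tt) are candidates; the rest of the
        # sequence is never searched for frequent sets.
        in_window = sorted(
            {e for t, e in events if tt - window <= t < tt and e != target}
        )
        for size in range(1, max_size + 1):
            for combo in combinations(in_window, size):
                counts[combo] += 1

    n = len(target_times)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


events = [
    (1, "rain"), (2, "wind"), (3, "flood"),
    (5, "wind"),
    (8, "rain"), (9, "wind"), (10, "flood"),
    (14, "rain"), (15, "flood"),
]
rules = candidate_antecedents(events, target="flood", window=3, min_support=0.6)
for antecedent, support in sorted(rules.items()):
    print(antecedent, "-> flood", round(support, 3))
```

Because candidates are enumerated only inside the windows that precede target events, the isolated "wind" at time 5 never enters the counts. A full method would also need a confidence measure, which this sketch leaves out.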
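For reference, the classical GM(1,1) grey model that STGM(1,1) extends can be sketched as below. This covers only the standard non-spatial model (accumulation, mean-generated background values, least-squares estimation of the development coefficient a and grey input b). The abstract does not say how the spatial autocorrelation term enters STGM(1,1), so that part is deliberately not reproduced. The function names `gm11_fit` and `gm11_forecast` are chosen here for illustration.

```python
import math


def gm11_fit(x0):
    """Estimate (a, b) of GM(1,1) from the whitening relation x0(k) + a*z1(k) = b."""
    n = len(x0)
    if n < 3:
        raise ValueError("need at least 3 observations for this sketch")
    # First-order accumulated generating operation (1-AGO).
    x1 = []
    total = 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Least squares for y = a*u + b with u = -z1 (closed form, two parameters).
    u = [-z for z in z1]
    m = n - 1
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(p * q for p, q in zip(u, y))
    a = (m * suy - su * sy) / (m * suu - su * su)
    b = (sy - a * su) / m
    return a, b


def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` values of the original series."""
    a, b = gm11_fit(x0)
    if abs(a) < 1e-12:
        raise ValueError("a is ~0; the time response function is undefined")
    n = len(x0)

    def x1_hat(k):
        # Time response of the accumulated series; k = 0 is the first observation.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse accumulation restores forecasts on the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Six observations growing by about 10% per step.
data = [100 * 1.1 ** k for k in range(6)]
a, b = gm11_fit(data)
forecast = gm11_forecast(data, steps=2)
print("a =", round(a, 4), "b =", round(b, 2))
print("forecast:", [round(v, 2) for v in forecast])
```

A negative a corresponds to a growing series. The model needs only a handful of observations, which is the "small-sample, poor-information" setting the abstract refers to.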
[Degree-granting institution]: Central South University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: P208