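Research on Key Problems of GML Spatio-temporal Clustering and Spatio-temporal Sequence Similarity Query
Topic: GML + spatio-temporal clustering; Source: Jiangxi University of Science and Technology, master's thesis, 2013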
[Abstract]: Despite the rapid development of modern information technology, GIS, an important part of that technology, still suffers from problems of data sharing and interoperability, which leave GIS work without good channels of communication and exchange. In response, the OGC introduced the GML specification, which builds a bridge between the various kinds of GIS data and connects the GIS community.

GML (Geography Markup Language) is a geographic information encoding specification for networked environments. With the continued development of computer, network, and database technology, it has come to be widely used in many fields. As the LBS (Location Based Service) market expands, large volumes of GML spatio-temporal data keep emerging. GML brings convenience, but it also creates a series of problems. The most prominent is information overload: the utilization rate of the information is low, and processing it exceeds people's capacity. Traditional data mining techniques target structured data and cannot handle GML data, which changes over time and has a hierarchical structure. This thesis therefore focuses on GML spatio-temporal clustering.

Time and space are the basic frame of reference for everything in the world, so spatio-temporal sequence data are widespread in real life, and their volume is growing geometrically. A great deal of valuable information lies behind these data. How to extract knowledge from massive spatio-temporal data, analyze the results, and give decision makers useful advice has become an urgent problem in spatial data mining. Research on GML spatio-temporal sequence similarity query still has considerable room for valuable work, especially for massive GML data.

In view of the current state of research on GML spatio-temporal clustering and spatio-temporal sequence similarity query, this thesis does the following:
(1) It describes the GML spatio-temporal data model in detail. It reviews several approaches to modeling spatio-temporal data and, for the storage of massive data, describes a GML spatio-temporal data model based on HBase.
(2) It studies algorithms for GML spatio-temporal clustering. It reviews the classical clustering algorithms (partitioning methods, hierarchical methods, density-based algorithms, grid-based algorithms, and model-based algorithms) and, building on them, proposes a K-means clustering algorithm based on spatial proximity relationships and a GML spatio-temporal clustering algorithm based on spatial neighborhoods. Each algorithm is verified experimentally. The K-means algorithm based on spatial proximity relationships is applied to verification of the spatial correlation of regional economic development, to spatial clustering analysis of regional economic development, and to spatio-temporal clustering analysis of regional economic development.
(3) It studies GML spatio-temporal sequence similarity query in depth, in particular GML time series similarity query based on spatial proximity relationships. The study uses national economic statistics for 31 provinces and municipalities of mainland China covering the 16 years from 1997 to 2012. GDP1per, GDP2per, and GDP3per are each standardized before the similarity measure is computed, and the analysis reflects the level of regional economic development and the structure of the three major industries in each region.
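The similarity query described above (standardize each indicator series, then compare series among spatially adjacent regions) can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the abstract does not say which standardization or which similarity measure the thesis uses, so z-score standardization and Euclidean distance are assumed here. The function names, the `neighbors` adjacency table, and all numeric values are hypothetical and are not taken from the thesis data.

```python
import math
from statistics import mean, pstdev


def zscore(series):
    """Standardize a series to zero mean and unit population variance.

    Assumption: z-score standardization; the abstract only says the
    indicators are standardized before similarity is measured.
    """
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no shape to compare; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length series (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_neighbors(query, series_by_region, neighbors, k=1):
    """Return up to k (region, distance) pairs, nearest first.

    Only regions listed as spatial neighbors of `query` are candidates,
    which is one simple way to bring spatial proximity into the query.
    """
    q = zscore(series_by_region[query])
    scored = [
        (euclidean(q, zscore(series_by_region[r])), r)
        for r in neighbors.get(query, [])
    ]
    scored.sort()
    return [(r, d) for d, r in scored[:k]]


# Made-up illustrative values, not data from the thesis.
series_by_region = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],  # same trend as A at a different scale
    "C": [4.0, 3.0, 2.0, 1.0],      # opposite trend
    "D": [1.0, 2.0, 3.0, 4.0],      # same values as A, but not adjacent to A
}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}

result = similar_neighbors("A", series_by_region, neighbors, k=1)
print(result)
```

Standardizing first means that regions are compared by the shape of their series and not by absolute magnitude, which is why region B ranks ahead of region C for query A in this toy data. Restricting candidates to the adjacency table keeps region D out of the result even though its series matches A exactly.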
[Degree-granting institution]: Jiangxi University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: P208
Article ID: 1858551
Article link: http://sikaile.net/kejilunwen/dizhicehuilunwen/1858551.html