Research on Residents' Travel Characteristics Based on Taxi Trajectory Data Mining
Topic: taxi + trajectory data mining; Source: Chang'an University, master's thesis, 2017
[Abstract]: Analysis of residents' travel behavior is an essential foundation for comprehensive urban transportation planning and urban construction planning, and it provides an effective basis for formulating traffic policy. Traditional travel data collection relies mainly on household interviews and questionnaires, which suffer from high misreporting rates and are time-consuming and labor-intensive, so they can no longer meet current needs. With the rapid development of geographic information system (GIS) technology and the wide deployment of the Global Positioning System (GPS), large volumes of individual movement trajectories are now being stored, which offers a new approach to analyzing residents' travel behavior. Using a real one-month GPS data set from 12,000 taxis in Xi'an, this thesis studies how to extract travel features from taxi trajectory data and applies them to travel behavior analysis, hot-area discovery, and identification of area functions. The main work is as follows:
(1) The cloud-based MapReduce parallel computing framework is used to perform secondary sorting and trajectory extraction on the raw GPS data set, completing data cleaning and map matching.
(2) Origin-destination (OD) information for residents' trips is extracted from the GPS data. Several travel features are designed, including the average number of trips, average trip duration, and average trip distance per time period. These features are used to compare the temporal patterns of travel in Xi'an on holidays and on working days, and visualization is used to analyze the spatial distribution of trips.
(3) A hot-area discovery algorithm based on an improved DBSCAN is proposed, addressing two weaknesses of traditional DBSCAN: sensitivity to parameters and unbounded cluster extent. The algorithm selects parameters adaptively according to the dynamic nearest-neighbor density of each cluster, takes an area constraint for hot areas, and splits any cluster that exceeds it, so that the large number of OD points is clustered into hot areas of reasonable size.
(4) Time-series features of the flow of people in hot areas are extracted at different dimensions to describe the relationship between how the flow in an area changes and the social function of that area. A semi-supervised classification algorithm combined with uncertainty sampling is proposed and applied to identifying the social function of hot areas, ultimately classifying them into six categories: stations, scenic spots, commercial districts, residential areas, schools, and entertainment districts.
The experimental results show that taxi trajectory data reflect the spatio-temporal distribution of urban residents' travel well. The improved DBSCAN algorithm produces hot areas of reasonable size, avoiding the unconstrained cluster area of the traditional algorithm. The flow features of hot areas can be used to identify their social function, and fine-grained flow time-series features give better classification results. The semi-supervised classification algorithm with uncertainty sampling needs labels for only a small number of areas to reach high classification accuracy.
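The OD extraction and per-period features described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation. It assumes each GPS record carries a taxi ID, a timestamp, a position, and an occupancy flag (a field common in taxi GPS feeds but not described in the abstract), and that records are already sorted by taxi and time, as a secondary sort would leave them. The names `GpsPoint`, `extract_od_trips`, and `mean_duration_by_hour` are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class GpsPoint:
    taxi_id: str
    ts: int          # seconds since epoch
    lon: float
    lat: float
    occupied: bool   # assumed field: True while a passenger is on board


Trip = Tuple[GpsPoint, GpsPoint]  # (origin, destination)


def extract_od_trips(points: Iterable[GpsPoint]) -> List[Trip]:
    """Return one (origin, destination) pair per completed occupied run.

    Assumes `points` is sorted by (taxi_id, ts).
    """
    trips: List[Trip] = []
    for _, track in groupby(points, key=lambda p: p.taxi_id):
        origin = None
        last = None
        for p in track:
            if p.occupied and origin is None:
                origin = p                    # vacant -> occupied: pick-up
            elif not p.occupied and origin is not None:
                trips.append((origin, last))  # last occupied point: drop-off
                origin = None
            last = p
        # A run still open at the end of a track is discarded as incomplete.
    return trips


def mean_duration_by_hour(trips: Iterable[Trip]) -> Dict[int, float]:
    """Average trip duration in seconds, keyed by the hour the trip started.

    The hour is taken in UTC; local time-zone handling is omitted.
    """
    totals: Dict[int, Tuple[int, int]] = {}
    for origin, dest in trips:
        hour = (origin.ts // 3600) % 24
        total, count = totals.get(hour, (0, 0))
        totals[hour] = (total + (dest.ts - origin.ts), count + 1)
    return {h: total / count for h, (total, count) in totals.items()}
```

Average trip count and average trip distance per period would follow the same grouping pattern, with a distance function over the origin and destination coordinates.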
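The uncertainty sampling step used with the semi-supervised classifier can be illustrated with a short sketch. It assumes a classifier that outputs a probability for each of the six area categories and uses Shannon entropy as the uncertainty measure. Entropy is one common choice; the abstract does not say which measure the thesis uses. The function names and the `pred` mapping are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pred: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Pick the `budget` unlabeled areas whose predictions are most uncertain.

    `pred` maps an area ID to the classifier's predicted class probabilities.
    """
    ranked = sorted(pred, key=lambda area: entropy(pred[area]), reverse=True)
    return ranked[:budget]
```

In a semi-supervised loop, the selected areas would be labeled by hand, added to the training set, and the classifier retrained. That loop is omitted here.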
[Degree-granting institution]: Chang'an University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: U491; TP311.13