Research on Short-Term Prediction of Taxi Regional Distribution Based on Trajectory Data Mining
Topic: taxi + trajectory data mining; Source: Jilin University, master's thesis, 2017
[Abstract]: Taxis are an indispensable part of the urban public transportation system and give residents a convenient, fast way to travel. In recent years, with the growth of the mobile Internet, ride-hailing platforms such as Didi have emerged. They offer residents a more efficient and convenient travel option, but they have also disrupted the traditional taxi operating model, and the traditional taxi industry now faces considerable competitive pressure. Taxi operators urgently need to raise their operating efficiency and competitiveness to withstand this impact.

Effective dispatching of vacant taxis is the key to lowering the vacancy rate and improving operating efficiency. Sound dispatching depends on accurate prediction of two things for the near future: ride demand in each region, and the distribution across regions of all taxis belonging to the operator. Regions where demand far exceeds supply are the dispatch destinations. The goal of dispatching is to move vacant taxis that would otherwise be in oversupplied regions to regions that will be undersupplied, so that taxi supply and demand balance.

This thesis focuses on predicting the regional distribution of taxis over a short future horizon. Unlike ride requests, which are markedly random, the future distribution of taxis across regions is highly correlated with their current distribution, so the prediction relies mainly on current regional distribution information. By mining historical taxi trajectory data, the thesis proposes several short-term prediction algorithms of different types and compares their performance in simulated prediction experiments. The three algorithms are a Markov process predictor based on probability and statistics, a matrix factorization algorithm (unsupervised learning), and a GBRT (gradient boosted regression trees) predictor (supervised learning).

A Markov process is a stochastic process whose defining property is the Markov property, that is, memorylessness. For short-term taxi distribution prediction, the Markov-based algorithm represents the real-time regional distribution of taxis as a vector and multiplies it by the region transition probability matrix that describes how taxis move between regions during that time slot of the day. The product is the predicted distribution of taxis across regions after a given interval.

Matrix factorization is the core step of latent factor models and is used mainly in recommender systems. The thesis introduces matrix factorization to taxi distribution prediction and adapts the basic algorithm to the spatio-temporal characteristics of taxi distribution so that it applies effectively to this problem.

GBRT is a typical supervised machine learning algorithm with strong generalization ability and high prediction accuracy. The thesis trains a separate regressor for each region to predict the number of taxis that will be in that region after a given interval.

To make effective use of the massive trajectory data that taxi operators have accumulated over many years, the thesis applies trajectory data mining techniques to preprocess the raw trajectories, extracts the information related to the spatio-temporal distribution of taxis, and organizes it as a tensor. In the subsequent learning and simulated prediction stages, the information stored in the tensor can be conveniently converted into a form suitable for training the learning algorithms and running simulated predictions.
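The Markov-process predictor described in the abstract can be illustrated with a small sketch. The abstract only states that a distribution vector is multiplied by a time-slot-specific region transition probability matrix; everything else here is an assumption made for illustration, including the function names `estimate_transition_matrix` and `predict_distribution`, the estimation of the matrix by counting observed moves, the fallback for regions with no history, and the toy data. It is not the thesis's implementation.

```python
def estimate_transition_matrix(moves, n_regions):
    """Estimate a row-stochastic transition matrix for one time slot.

    moves: iterable of (origin_region, destination_region) pairs, one per
    taxi observed at the start and end of the interval (hypothetical input).
    """
    counts = [[0] * n_regions for _ in range(n_regions)]
    for origin, dest in moves:
        counts[origin][dest] += 1

    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: with no history for a region, taxis stay put.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_regions)])
        else:
            matrix.append([c / total for c in row])
    return matrix


def predict_distribution(current, matrix):
    """Row vector times matrix: expected taxis per region after one interval."""
    n = len(current)
    return [sum(current[i] * matrix[i][j] for i in range(n)) for j in range(n)]


# Toy history for a single time slot over three regions (made-up data).
history = [
    (0, 0), (0, 1), (0, 1), (0, 2),
    (1, 1), (1, 1), (1, 0), (1, 2),
    (2, 2), (2, 0),
]
P = estimate_transition_matrix(history, n_regions=3)

current = [40, 20, 10]          # taxis currently in regions 0, 1, 2
predicted = predict_distribution(current, P)
print(predicted)                # [20.0, 30.0, 20.0]
```

Because every row of the matrix sums to 1, the predicted counts sum to the current fleet size: the step redistributes taxis without creating or removing any. In practice one such matrix would be estimated per time slot of the day, as the abstract indicates.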
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification numbers]: F570; O211.62