Research on Adaptive Urban Traffic Signal Control Methods Based on Reinforcement Learning

Published: 2018-03-03 18:07

Topic: reinforcement learning. Focus: intelligent transportation. Source: Zhejiang Normal University, 2015 master's thesis. Document type: degree thesis.

[Abstract]: Urban roads are continually being built and widened, and investment in infrastructure keeps growing, yet urban traffic congestion is becoming more serious. The main reason is that existing urban Traffic Signal Control (TSC) systems cannot fully achieve optimal control and management of traffic flow. How to design and optimize urban TSC systems through optimal control of traffic signals is therefore key to keeping traffic safe and smooth, increasing road throughput, and relieving congestion. This thesis studies urban TSC optimization using a single-Agent control architecture based on the Q-learning algorithm, a Multi-Agent system (MAS) based on a distributed Q-learning algorithm, and the open-source Green Light District (GLD) simulation platform. The main work is as follows.

(1) An Agent framework for urban TSC systems is designed for both a single intersection and a "#"-shaped (井-pattern) regional group of intersections, in order to simulate urban road control. For a single intersection, an intelligent Agent detects the traffic flow data of each direction in real time; the data are fuzzified with fuzzy logic and fed into the designed single-intersection Q-learning decision maker, which searches for the optimal control policy. For regional traffic control, an optimization scheme combining the distributed Q-learning algorithm with MAS is proposed, and a coordinated control model for Agents at adjacent intersections is given, so that adjacent intersections share information.

(2) The key design questions of the Q-learning and distributed Q-learning algorithms are addressed: the traffic environment state set S, the action set A, and the reward function R. For the state space, queue length is computed with fuzzy logic. The action set A consists of increasing, keeping, and decreasing the green time of a phase. The reward function R uses the vehicle queue length at the intersection as its index, with the goal of minimizing queue length.

(3) The distributed Q-learning algorithm is applied to the optimization of a regional TSC system, addressing the problem of coordinated regional signal control. The urban regional traffic network is treated as a distributed multi-Agent network; a Multi-Agent model framework based on the distributed Q-learning algorithm is established, and the detailed design steps of the distributed Q-learning algorithm are given.

Finally, the thesis analyzes the performance of single-intersection TSC optimization based on Q-learning and of regional TSC optimization based on distributed Q-learning. In GLD, the optimization performance of random timing, fixed timing, Longest-queue, Traffic-controller 1 (TC1), ACGJ-1, the Q-learning algorithm, and the distributed Q-learning algorithm is simulated and compared. The experimental results indicate that the Q-learning and distributed Q-learning algorithms outperform the other algorithms in urban TSC optimization.
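The single-intersection design described in the abstract (fuzzified queue lengths as state, three green-time actions, queue length as the reward index) can be illustrated with a minimal tabular Q-learning sketch. This is not the thesis's implementation: the fuzzy breakpoints, the label names `short`/`medium`/`long`, the hyperparameters, and the helpers `fuzzy_level`, `encode_state`, and `QAgent` are all assumptions made here for illustration, and the update shown is the standard Q-learning rule.

```python
import random
from collections import defaultdict

# Action set A from the abstract: change applied to the current phase's green time.
ACTIONS = ("decrease", "keep", "increase")


def tri(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def fuzzy_level(queue_len):
    """Map a queue length (vehicles) to the fuzzy label with highest membership.

    The breakpoints are hypothetical, not values from the thesis.
    """
    memberships = {
        "short": tri(queue_len, -1, 0, 10),
        "medium": tri(queue_len, 5, 12, 20),
        "long": 1.0 if queue_len >= 25 else tri(queue_len, 15, 25, 26),
    }
    return max(memberships, key=memberships.get)


def encode_state(queues):
    """State s: one fuzzy label per approach of the intersection."""
    return tuple(fuzzy_level(q) for q in queues)


def reward(queues):
    """Reward R: negative total queue length, so shorter queues score higher."""
    return -float(sum(queues))


class QAgent:
    """Tabular Q-learning agent with epsilon-greedy action selection."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # Q-values keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the greedy action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, r, next_state):
        # Standard rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In a simulation loop, one would call `encode_state` on the detected queue lengths, pick an action with `act`, apply it to the signal plan, and then call `update` with the reward computed from the new queue lengths. For the regional case, a comparable sketch might extend each Agent's state with information shared by adjacent intersections, though the abstract does not specify how that sharing is encoded.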
[Degree-granting institution]: Zhejiang Normal University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: U491.54

Article ID: 1562112

Article link: http://sikaile.net/kejilunwen/daoluqiaoliang/1562112.html
