Research on a Cloud-Based Decision-Making Method for Intelligent Model Cars Using Deep Reinforcement Learning
Published: 2021-09-24 01:09
A common way to improve traffic efficiency is to control traffic signals so that traffic keeps flowing; however, because vehicle behavior is not controllable, the effect in practice is limited. With the development of intelligent connected-vehicle technology, the cloud control center of a traffic system can control not only the traffic signals but potentially the vehicles themselves. In this cloud-centered traffic management mode, cloud-side decision-making capability is the key factor determining the efficiency of the traffic system, and cloud-side decision algorithms will be a key technology of future intelligent transportation systems. Because research on cloud-side decision-making involves multi-vehicle coordination, studying it with real vehicles is both difficult and dangerous. This thesis therefore uses intelligent model cars as the experimental platform and focuses on cloud-side decision algorithms based on deep reinforcement learning. The work falls into three main parts. First, the thesis proposes an indoor position-and-attitude estimation algorithm that fuses vision and UWB, solving the problem of accurately obtaining the model cars' poses. Existing indoor positioning methods rely on camera-based detection, whose drawback is excessive sensitivity to lighting and target color, making detection and localization insufficiently stable and reliable. The thesis therefore studies UWB, a wireless positioning method that does not depend on cameras, and builds a UWB positioning system with adaptive base-station selection, addressing the problems of clock synchronization across multiple base stations and unstable positioning accuracy. On this basis, the thesis further studies a method for fusing real-time positioning information with prior map information, achieving position detection and tracking of multiple targets. The UWB positioning system achieves good localization accuracy but cannot provide the model cars' attitude; the thesis therefore further studies camera de…
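The abstract's UWB pipeline turns anchor-to-tag range measurements into a position (the step the table of contents calls "Solving the Trilateration Algorithm"). The thesis's exact solver is not shown here, but a standard approach is to linearize the circle equations by subtracting the first anchor's equation and solve the resulting system by least squares. A minimal 2D sketch, with hypothetical anchor coordinates:

```python
from math import dist  # Euclidean distance, Python >= 3.8

def trilaterate(anchors, ranges):
    """Least-squares 2D position from >= 3 anchor-range pairs.

    Each anchor i satisfies (x - xi)^2 + (y - yi)^2 = di^2.
    Subtracting the first equation cancels the quadratic terms,
    leaving a linear system in (x, y); we solve its 2x2 normal
    equations directly, so any number of anchors >= 3 works.
    """
    (x0, y0), d0 = anchors[0], ranges[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], ranges[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append(d0**2 - di**2 + xi**2 - x0**2 + yi**2 - y0**2)
    # Normal equations (A^T A) p = A^T b for the two unknowns (x, y).
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Illustrative anchors and a known ground-truth tag position.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
truth = (3.0, 4.0)
ranges = [dist(a, truth) for a in anchors]
print(trilaterate(anchors, ranges))  # → close to (3.0, 4.0)
```

With noisy real UWB ranges the least-squares residual also gives a cheap consistency check, which is one way a false-ranging detection step (Section 3.2.2) can reject outlier measurements.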
[Source]: Tsinghua University, Beijing (Project 211 / Project 985 institution, directly under the Ministry of Education)
[Pages]: 94
[Degree]: Master's
[Table of Contents]:
Abstract (in Chinese)
Abstract
Chapter 1 Introduction
1.1 Background
1.2 Related Work
1.3 Problem Statement
1.4 Thesis Outline
Chapter 2 Theory of Deep Reinforcement Learning
2.1 Reinforcement Learning
2.1.1 Elements of Reinforcement Learning
2.1.2 Markov Decision Process
2.1.3 Dynamic Programming
2.1.4 Learning Methods
2.2 Neural Networks
2.2.1 Basics of Neural Networks
2.2.2 Neural Networks in Reinforcement Learning (Deep Reinforcement Learning)
Chapter 3 Localization Based on Fusion of UWB and Camera
3.1 Overall Structure of the Localization System
3.2 UWB Localization
3.2.1 Ranging Process
3.2.2 False-Ranging Detection
3.2.3 Solving the Trilateration Algorithm
3.3 Camera Localization
3.3.1 Fisheye Calibration
3.3.2 Camera Detection
3.4 Sensor Fusion Between Camera, UWB and Map Information
Chapter 4 Deep Reinforcement Learning Method
4.1 Training Environment
4.1.1 Map Drawing
4.1.2 Dynamics Model
4.1.3 Steering Control Model
4.1.4 State and Action Spaces
4.1.5 Reward Function
4.1.6 Termination Stage
4.2 Reinforcement Learning Models
4.2.1 Deep Q-Learning with Experience Replay
4.2.2 Asynchronous Advantage Actor-Critic (A3C)
4.3 Training Results
4.3.1 Deep Q-Learning Network
4.3.2 Asynchronous Advantage Actor-Critic (A3C)
Chapter 5 Deep Reinforcement Learning Validation and Evaluation
5.1 Experimental Setup
5.2 Motion Control of the Intelligent Vehicle
5.3 Localization Results
5.4 Decision-Making Validation Experiment Results
5.4.1 Validation of RL-Based Decisions Through Simulation Software
5.4.2 Validation of RL-Based Decisions Through Model Cars
Chapter 6 Conclusion
References
Acknowledgement
Resume
Article ID: 3406794
Article link: http://sikaile.net/kejilunwen/qiche/3406794.html