Machine-Learning-Based Optimization of Motion and Cooperation for 3D Simulated Soccer Robots
[Abstract]: This thesis sets up an HTCondor high-throughput computing cluster for the RoboCup 3D simulation platform, builds a motion optimization mechanism for individual RoboCup 3D robots, and uses it to train and optimize the robots' kicking and walking parameters. It then studies effective cooperation strategies for multiple agents in a dynamic environment, based on the team's formation and role assignment. For individual motion optimization, single-machine optimization is slow, so the high-throughput cluster distributes the work across networked machines and shortens the optimization time. The kicking actions of the five robot types are optimized with the CMA-ES algorithm; using a reinforcement learning training framework built on this algorithm, the robots' long-shot and kick actions were successfully optimized. To keep walking optimization from overfitting a single training task, a hierarchical learning method with multiple subtasks and multiple parameter subsets is designed, which improves the mobility and stability of walking, turning, and dribbling for the five robot types. For multi-agent cooperation, the thesis studies the optimization of team formation and role assignment.
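The parameter search described above can be pictured with a minimal sketch. It is not the thesis's implementation and not full CMA-ES: it is a simplified evolution strategy that keeps an isotropic search distribution and a fixed step-size decay instead of adapting a covariance matrix, although it borrows the log-rank recombination weights commonly used in CMA-ES. The function `kick_distance`, its three parameters, and its optimum are hypothetical stand-ins for a simulated kick trial.

```python
import math
import random


def kick_distance(params):
    """Hypothetical stand-in for one simulated kick trial.

    Returns a score that peaks at 20.0 when the parameters match a
    made-up optimum; a real evaluation would run the simulator.
    """
    target = (0.6, -0.3, 1.2)
    return 20.0 - sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(fitness, start, sigma=0.5, population=12, parents=4,
           generations=60, seed=0):
    """Simplified evolution strategy (assumption: isotropic sampling,
    fixed step-size decay, no covariance matrix adaptation)."""
    rng = random.Random(seed)
    # Log-rank recombination weights, normalised to sum to 1.
    raw = [math.log(parents + 0.5) - math.log(rank + 1)
           for rank in range(parents)]
    total = sum(raw)
    weights = [w / total for w in raw]

    mean = list(start)
    best_params, best_score = list(mean), fitness(mean)
    for _ in range(generations):
        # Sample candidate parameter sets around the current mean.
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(population)]
        # Each evaluation is independent of the others, which is what
        # makes this step easy to spread over cluster nodes.
        scored = sorted(((fitness(c), c) for c in candidates),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_params = scored[0][0], list(scored[0][1])
        # Move the mean to the weighted average of the best candidates.
        elite = [c for _, c in scored[:parents]]
        mean = [sum(w * c[i] for w, c in zip(weights, elite))
                for i in range(len(mean))]
        sigma *= 0.95  # shrink the search radius each generation
    return best_params, best_score
```

Calling `evolve(kick_distance, [0.0, 0.0, 0.0])` returns the best parameter set found and its score. Because the candidates within a generation do not depend on one another, the evaluation loop is the natural place to hand trials to separate machines.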
First, the field is partitioned using Delaunay triangulation, and the team's formations are designed with the situation-based strategic positioning (SBSP) mechanism, which diversifies the team's overall formation at key positions. Once the formation is fixed, a Markov decision process (MDP) model is used to optimize role assignment within the team, taking into account factors such as distance, orientation, falls, and speed for the five different robot types in the simulation environment. A learning algorithm is used to solve for the action-value function of the MDP model and find the optimal role assignment scheme, improving the team's overall attacking and defending efficiency. Repeated experiments show that this work has a marked effect on both the individual robots of the Apollo 3D team and the team's formation assignment and role switching.
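As an illustration of solving an action-value function for role assignment, here is a minimal sketch. The abstract does not name the learning algorithm that survives in this text, so tabular Q-learning is assumed purely for illustration. The states, roles, rewards, and transition model are all hypothetical and far simpler than the distance, orientation, fall, and speed factors the thesis considers.

```python
import random

# Hypothetical, heavily simplified situations for one candidate robot.
STATES = ["near_upright", "near_fallen", "far_upright"]
ACTIONS = ["attacker", "supporter"]

# Made-up rewards: the attacker role pays off only for a robot that is
# close to the ball and standing; the supporter role is always neutral.
REWARD = {
    ("near_upright", "attacker"): 1.0,
    ("near_fallen", "attacker"): -1.0,
    ("far_upright", "attacker"): -0.5,
}


def q_learning(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning on the toy role-assignment MDP above."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = REWARD.get((state, action), 0.0)
        # Toy dynamics (assumption): the next situation is drawn
        # uniformly at random, independent of the chosen role.
        next_state = rng.choice(STATES)
        # Move the estimate toward reward + discounted best next value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (
            reward + gamma * best_next - q[(state, action)]
        )
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued role in every situation."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

`greedy_policy(q_learning())` then maps each situation to the role with the highest learned value. In this toy setting the learned policy assigns the attacker role only to a robot that is near the ball and upright.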
[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP242
Article ID: 2492635
Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/2492635.html