Machine-Learning-Based Optimization of Motion and Cooperation for 3D Simulated Soccer Robots

Published: 2019-06-04 09:34

[Abstract]: This thesis builds an HTCondor high-throughput computing cluster for the RoboCup 3D simulation platform and, on top of it, a motion optimization mechanism for individual RoboCup 3D robots that trains and optimizes their kicking and walking parameters. It then studies effective cooperative and adversarial strategies for multiple agents in a dynamic environment through two problems: team formation and role assignment.

For individual motion optimization, single-machine optimization is slow, so the high-throughput cluster is used to distribute the work across networked resources and shorten optimization time. The CMA-ES algorithm is then used to optimize the kicking motions of the five robot types; a reinforcement learning training framework built on this algorithm successfully optimized the robots' long-range kick and quick kick. Because walk optimization tends to overfit a single training task, a layered learning method with multiple subtasks and multiple parameter subsets was designed, which improved the mobility and stability of walking, turning, and dribbling for all five robot types.

For multi-agent cooperation and confrontation, the thesis addresses formation optimization and role assignment optimization separately. First, the field is partitioned with a Delaunay triangulation, and the team formation is designed with Situation Based Strategic Positioning (SBSP), which gives the team varied overall formations when the ball is at key positions. With the formation fixed, a Markov decision process (MDP) model is used to optimize role assignment, taking into account factors such as distance, orientation, whether the robot has fallen, and speed for the five robot types in the simulation. The action-value function of the MDP is solved with the Sarsa(λ) learning algorithm with linear function approximation to find the optimal role assignment, which improved the team's overall attacking and defending efficiency. Repeated experiments show that this work substantially improves the Apollo3D team, both in individual robot motion and in team-level formation assignment and role rotation.
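As a rough illustration of the role assignment step, the sketch below implements Sarsa(λ) with a linear action-value function Q(s, a) = w · φ(s, a) and accumulating eligibility traces. Everything about the task is an assumption made for illustration, since the abstract does not give the thesis's actual features, rewards, or parameters: two robots, a hypothetical feature vector (bias, distance to ball, fallen flag), a hypothetical reward that penalizes distance and fallen robots, and episodes that end at random. The "action" is simply which robot is assigned to chase the ball.

```python
import random


class LinearSarsaLambda:
    """Sarsa(lambda) with Q(s, a) = w . phi(s, a) and accumulating traces."""

    def __init__(self, n_features, alpha=0.01, gamma=0.9, lam=0.7):
        self.w = [0.0] * n_features   # weight vector
        self.z = [0.0] * n_features   # eligibility traces
        self.alpha, self.gamma, self.lam = alpha, gamma, lam

    def q(self, phi):
        return sum(w * x for w, x in zip(self.w, phi))

    def start_episode(self):
        self.z = [0.0] * len(self.w)

    def update(self, phi, reward, phi_next):
        """One Sarsa(lambda) step; phi_next is None at the end of an episode."""
        q_next = 0.0 if phi_next is None else self.q(phi_next)
        delta = reward + self.gamma * q_next - self.q(phi)
        # Decay old traces, then add the current feature vector.
        self.z = [self.gamma * self.lam * z + x for z, x in zip(self.z, phi)]
        self.w = [w + self.alpha * delta * z for w, z in zip(self.w, self.z)]


# --- Hypothetical toy task (not taken from the thesis) ---------------------

def features(state, robot):
    """phi(s, a): bias, distance to ball, fallen flag of the chosen robot."""
    dist, fallen = state[robot]
    return [1.0, dist, 1.0 if fallen else 0.0]


def reward(state, robot):
    """Assumed cost: distance to the ball plus a fixed penalty if fallen."""
    dist, fallen = state[robot]
    return -(dist + (3.0 if fallen else 0.0))


def random_state(rng, n_robots=2):
    """Each robot gets a random distance in [0, 1) and a 30% chance of lying down."""
    return [(rng.random(), rng.random() < 0.3) for _ in range(n_robots)]


def choose(agent, state, rng, epsilon):
    """Epsilon-greedy choice of which robot chases the ball."""
    if rng.random() < epsilon:
        return rng.randrange(len(state))
    return max(range(len(state)), key=lambda r: agent.q(features(state, r)))


def train(episodes=20000, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    agent = LinearSarsaLambda(n_features=3)
    for _ in range(episodes):
        agent.start_episode()
        state = random_state(rng)
        robot = choose(agent, state, rng, epsilon)
        while True:
            phi, r = features(state, robot), reward(state, robot)
            if rng.random() < 0.2:            # episode ends at random
                agent.update(phi, r, None)
                break
            next_state = random_state(rng)
            next_robot = choose(agent, next_state, rng, epsilon)
            agent.update(phi, r, features(next_state, next_robot))
            state, robot = next_state, next_robot
    return agent
```

After `agent = train()`, calling `choose(agent, state, random.Random(), 0.0)` returns the greedy assignment for a given `state`. In this toy setup the learned weights on distance and on the fallen flag should both come out negative, so the closer, upright robot is preferred. A real role assignment would need a richer state (orientation, speed, several roles at once) and a reward tied to match outcomes.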
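[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree]: Master's
[Year conferred]: 2017
[Classification number]: TP242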
