
Police Patrol Path Planning via Multi-Agent Reinforcement Learning Based on the Stackelberg Strategy

Published: 2018-02-16 13:10

Keywords: patrol route planning; Stackelberg strong equilibrium strategy; multi-agent; reinforcement learning　Source: Transactions of Beijing Institute of Technology (《北京理工大學學報》), 2017, Issue 01　Paper type: journal article

[Abstract]: Existing patrol path planning algorithms handle only two-player games and ignore the presence of attackers. To address this, a new multi-agent reinforcement learning algorithm is proposed. Given the distribution of attack targets, it plans optimal patrol paths for arbitrary numbers of defenders and attackers. Because defenders and attackers do not choose their strategies simultaneously, the Stackelberg strong equilibrium strategy is adopted as the basis on which each agent selects its strategy. The algorithm was tested on multiple patrol tasks, and the quantitative and qualitative experimental results demonstrate its convergence and effectiveness.
[Author affiliation]: School of Network Security and Protection, People's Public Security University of China
[Funding]: Fundamental Research Funds of the People's Public Security University of China (2014JKF01132)
[CLC number]: D631.1; TP18
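
The abstract says each agent chooses its strategy by a Stackelberg strong equilibrium, since the defender commits first and the attacker responds. The listing does not give the paper's actual algorithm, so the following is only a minimal sketch of that selection rule under simplifying assumptions: a single state, one defender (leader) and one attacker (follower), pure strategies only, and hypothetical payoff tables standing in for learned Q-values. The function name `strong_stackelberg_pure` and all numbers are illustrative and do not come from the paper.

```python
from typing import List, Tuple


def strong_stackelberg_pure(
    leader_payoff: List[List[float]],
    follower_payoff: List[List[float]],
) -> Tuple[int, int]:
    """Pure-strategy strong Stackelberg equilibrium of a two-player matrix game.

    Rows index the leader's actions, columns the follower's actions.
    Returns (leader_action, follower_action).
    """
    best_pair = (0, 0)
    best_value = float("-inf")
    for a, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower observes the leader's commitment and best-responds.
        top = max(f_row)
        responses = [b for b, v in enumerate(f_row) if v == top]
        # "Strong" equilibrium: follower ties are broken in the leader's favor.
        b = max(responses, key=lambda j: l_row[j])
        # The leader commits to the action with the best induced payoff.
        if l_row[b] > best_value:
            best_value = l_row[b]
            best_pair = (a, b)
    return best_pair


# Hypothetical example: the defender patrols site 0 or 1, the attacker
# attacks site 0 or 1. Site 0 is worth 2, site 1 is worth 1; a caught
# attacker yields +1 to the defender and -1 to the attacker.
defender_q = [[1.0, -1.0], [-2.0, 1.0]]
attacker_q = [[-1.0, 1.0], [2.0, -1.0]]
print(strong_stackelberg_pure(defender_q, attacker_q))  # (0, 1)
```

In this toy setting the defender guards the more valuable site 0 and accepts the smaller loss at site 1. A multi-agent reinforcement learning version would presumably apply a rule of this kind to per-state value estimates during learning, but that extension is an assumption and not something the abstract specifies.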

Article ID: 1515593


Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1515593.html
