Research on Event-Driven Multi-Agent Reinforcement Learning

Published: 2018-04-27 23:25

Topic: event-driven + multi-agent; Reference: 《智能系統學報》, 2017, Issue 01

Abstract: To address the heavy consumption of communication and computing resources in multi-agent reinforcement learning, this paper proposes an event-driven multi-agent reinforcement learning algorithm, focusing on the role of event-driven mechanisms at the policy layer of multi-agent learning. During the interaction between the agents and the environment, the algorithm applies the event-driven idea and designs a trigger function based on the rate of change of each agent's observations, so that communication and learning no longer need to take place in real time or periodically. As a result, fewer data transmissions and computations are required within the same period of time. The paper also analyzes the computational resource consumption of the algorithm and demonstrates its convergence. Finally, simulation experiments show that the algorithm reduces the number of communications and policy traversals during learning, thereby easing the consumption of communication and computing resources.
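The abstract does not state the trigger function itself, so the following is only a minimal sketch of the general idea, assuming a relative-change measure and a fixed threshold. The names `change_rate`, `EventTrigger`, `count_events`, and `threshold` are hypothetical and are not taken from the paper. An agent compares its current observation with the one recorded at the last event, and communicates or updates its policy only when the change exceeds the threshold.

```python
import math
from typing import List, Optional, Sequence


def change_rate(current: Sequence[float], reference: Sequence[float]) -> float:
    """Relative change of an observation with respect to a reference.

    Assumed form (not from the paper): Euclidean distance between the two
    observations divided by the norm of the reference.
    """
    diff = math.sqrt(sum((c - r) ** 2 for c, r in zip(current, reference)))
    scale = math.sqrt(sum(r ** 2 for r in reference))
    return diff / (scale + 1e-8)  # small constant avoids division by zero


class EventTrigger:
    """Fires when the observation has changed enough since the last event."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold  # hypothetical tuning parameter
        self.reference: Optional[List[float]] = None
        self.events = 0

    def should_trigger(self, observation: Sequence[float]) -> bool:
        # The first observation always triggers, so a reference exists.
        if self.reference is None or (
            change_rate(observation, self.reference) > self.threshold
        ):
            self.reference = list(observation)
            self.events += 1
            return True
        return False


def count_events(observations: Sequence[Sequence[float]], threshold: float) -> int:
    """Run the trigger over a sequence of observations and count the events."""
    trigger = EventTrigger(threshold)
    for obs in observations:
        if trigger.should_trigger(obs):
            # In a learning loop, communication and the policy update
            # would happen here; all other steps skip both.
            pass
    return trigger.events


if __name__ == "__main__":
    stream = [[1.0, 0.0], [1.01, 0.0], [1.02, 0.0], [2.0, 0.0], [2.01, 0.0]]
    print(count_events(stream, threshold=0.1))
```

With this toy stream, only the first observation and the jump to `[2.0, 0.0]` fire the trigger, so the remaining three steps would skip both communication and the policy update. A larger threshold saves more transmissions at the cost of reacting later to changes in the environment.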
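Author affiliation: School of Electrical Engineering, Southwest Jiaotong University
Funding: National Natural Science Foundation of China, Young Scientists Fund (61304166)
Classification number: TP181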

Article link: http://sikaile.net/kejilunwen/zidonghuakongzhilunwen/1812797.html
