Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks / Crespi, Marco; Custode, Leonardo Lucio; Iacca, Giovanni. - 13627:(2022), pp. 262-276. (Paper presented at the 10th International Conference on Bioinspired Optimization Methods and Their Applications, BIOMA 2022, held in Maribor, 17th-18th November 2022) [10.1007/978-3-031-21094-5_19].
Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks
Crespi, Marco; Custode, Leonardo Lucio; Iacca, Giovanni
2022-01-01
Abstract
Deep Learning (DL) has enabled significant advances in Multi-Agent Reinforcement Learning (MARL), speeding up progress in the field. However, agents trained by means of DL in MARL settings have an important drawback: their policies are extremely hard to interpret, not only at the level of the individual agent, but also (and especially) because one has to take into account the interactions across the whole set of agents. In this work, we take a step towards interpretability in MARL tasks. To do so, we present an approach that combines evolutionary computation (namely, grammatical evolution) and reinforcement learning (Q-learning), which allows us to produce agents that are, at least to some extent, understandable. Moreover, unlike the typically centralized DL-based approaches (and thanks to the possibility of using a replay buffer), our method can easily employ Independent Q-learning to train a team of agents, which facilitates robustness and scalability. Evaluating our approach on the Battlefield task from the MAgent implementation in the PettingZoo library, we observe that the evolved team of agents coordinates its actions in a distributed fashion, solving the task effectively.
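As a concrete illustration of the combination described in the abstract, the sketch below pairs a tabular Q-learning leaf (standing in for one leaf of an evolved, interpretable policy structure) with an Independent Q-learning loop over a PettingZoo AEC environment. It is a minimal sketch under stated assumptions, not the authors' implementation: the `QLearningLeaf` class, the use of a single leaf per agent, and the `magent2` import path (the MAgent environments were later moved out of PettingZoo into the separate `magent2` package) are all assumptions; adjust the import to your installed version.

```python
import numpy as np

# Assumption: MAgent environments as repackaged in the "magent2" library;
# earlier PettingZoo releases shipped them as pettingzoo.magent.battlefield_vX.
from magent2.environments import battlefield_v5


class QLearningLeaf:
    """Illustrative leaf of an evolved policy: stores one Q-value per
    action and updates them online with tabular Q-learning."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.05):
        self.q = np.zeros(n_actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.prev_action = None

    def act(self):
        # Epsilon-greedy action selection over this leaf's Q-values.
        if np.random.rand() < self.epsilon:
            self.prev_action = np.random.randint(len(self.q))
        else:
            self.prev_action = int(np.argmax(self.q))
        return self.prev_action

    def update(self, reward):
        # One-step Q-learning update; for brevity the bootstrap term
        # max_a' Q(s', a') is taken from the same (single) leaf.
        if self.prev_action is None:
            return
        target = reward + self.gamma * np.max(self.q)
        self.q[self.prev_action] += self.alpha * (target - self.q[self.prev_action])


env = battlefield_v5.env()
env.reset(seed=0)

# Independent Q-learning: every agent owns its own learner and is trained
# from its local rewards only. In the actual method, each agent's policy
# is an interpretable structure evolved by grammatical evolution; here a
# single leaf (a degenerate tree) stands in for it.
learners = {a: QLearningLeaf(env.action_space(a).n) for a in env.agents}

for agent in env.agent_iter():
    obs, reward, termination, truncation, _ = env.last()
    learners[agent].update(reward)
    if termination or truncation:
        env.step(None)  # finished agents must step with a None action
    else:
        env.step(learners[agent].act())
env.close()
```

In the full approach, grammatical evolution would plausibly supply the structure that routes each observation to a leaf, with the episode return serving as the fitness of each evolved individual; since every agent learns independently from its own local rewards, no centralized critic is needed, which is what makes the scheme decentralized and scalable.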
| File | Type | License | Size | Format | Access |
|---|---|---|---|---|---|
| Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks.pdf | Publisher's layout (editorial version) | All rights reserved | 372.22 kB | Adobe PDF | Restricted to archive managers |
| Interpretable_MARL_Pettingzoo.pdf | Refereed author's manuscript (post-print) | All rights reserved | 328.08 kB | Adobe PDF | Open Access since 13/11/2023 |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.