
Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks / Crespi, Marco; Custode, Leonardo Lucio; Iacca, Giovanni. - 13627:(2022), pp. 262-276. (Paper presented at BIOMA 2022, held in Maribor, 17-18 November 2022) [10.1007/978-3-031-21094-5_19].

Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks

Custode, Leonardo Lucio; Iacca, Giovanni
2022-01-01

Abstract

Deep Learning (DL) has allowed the field of Multi-Agent Reinforcement Learning (MARL) to make significant advances, speeding up progress in the field. However, agents trained by means of DL in MARL settings have an important drawback: their policies are extremely hard to interpret, not only at the level of the individual agent but also, and especially, because one has to take into account the interactions across the whole set of agents. In this work, we take a step towards achieving interpretability in MARL tasks. To do so, we present an approach that combines evolutionary computation (specifically, grammatical evolution) and reinforcement learning (Q-learning), which allows us to produce agents that are, at least to some extent, understandable. Moreover, unlike typical centralized DL-based approaches (and thanks to the possibility of using a replay buffer), our method can easily employ Independent Q-learning to train a team of agents, which facilitates robustness and scalability. Evaluating our approach on the Battlefield task from the MAgent implementation in the PettingZoo library, we observe that the evolved team of agents is able to coordinate its actions in a distributed fashion, solving the task effectively.
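To make the idea in the abstract concrete, the following is a minimal illustrative sketch (not the authors' code) of how a decision-tree policy, whose structure would be produced by grammatical evolution, can hold per-leaf Q-values updated by Q-learning. All class and function names, and the hyperparameter values, are assumptions for illustration; under Independent Q-learning, each agent in the team would apply these updates to its own tree using only its own observations and rewards.

```python
import random

class Node:
    """Internal node of the (assumed) evolved tree: tests one
    observation feature against a threshold."""
    def __init__(self, feature, threshold, left, right):
        self.feature, self.threshold = feature, threshold
        self.left, self.right = left, right

class Leaf:
    """Leaf holding per-action Q-values, updated online by Q-learning."""
    def __init__(self, n_actions):
        self.q = [0.0] * n_actions

def descend(tree, obs):
    """Route an observation down the tree to its leaf."""
    node = tree
    while isinstance(node, Node):
        node = node.left if obs[node.feature] < node.threshold else node.right
    return node

def act(tree, obs, epsilon=0.1):
    """Epsilon-greedy action selection on the reached leaf."""
    leaf = descend(tree, obs)
    if random.random() < epsilon:
        return random.randrange(len(leaf.q))
    return max(range(len(leaf.q)), key=lambda a: leaf.q[a])

def q_update(tree, obs, action, reward, next_obs, done,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the leaf reached by `obs`."""
    leaf = descend(tree, obs)
    target = reward if done else reward + gamma * max(descend(tree, next_obs).q)
    leaf.q[action] += alpha * (target - leaf.q[action])
```

Because the policy is a shallow tree over named observation features, the learned behaviour stays human-readable: each leaf's greedy action can be read off directly from the path of feature tests that leads to it.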
Bioinspired Optimization Methods and Their Applications (BIOMA) 2022
Cham, Switzerland
Springer
ISBN: 978-3-031-21093-8; 978-3-031-21094-5
Crespi, Marco; Custode, Leonardo Lucio; Iacca, Giovanni
Files in this record:

File: Towards Interpretable Policies in Multi-agent Reinforcement Learning Tasks.pdf
Access: archive administrators only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 372.22 kB
Format: Adobe PDF

File: Interpretable_MARL_Pettingzoo.pdf
Access: under embargo until 12 November 2023
Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 328.08 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/357524
Citations: Scopus 0