Promoting Behavioral Diversity via Multi-Objective/Quality-Diversity Novelty Producing Synaptic Plasticity / Bizzotto, Edoardo; Yaman, Anil; Iacca, Giovanni. - (2021), pp. 1-8. (Paper presented at the 2021 IEEE Symposium Series on Computational Intelligence, SSCI 2021, held in Orlando, FL, USA, 5th-7th December 2021) [10.1109/SSCI50451.2021.9659978].
Promoting Behavioral Diversity via Multi-Objective/Quality-Diversity Novelty Producing Synaptic Plasticity
Iacca, Giovanni
2021-01-01
Abstract
Learning in contexts where measuring agent performance is either impossible or misleading requires a different approach to the search for a solution. Such problems require either a complete exploration of the search space or the use of reward-independent approaches, which may not be feasible in some situations. The Novelty Producing Synaptic Plasticity (NPSP) algorithm was recently proposed as a means of achieving successful learning in such contexts by evolving synaptic plasticity rules that generate as many novel behaviors as possible. Here, we consider a deceptive maze navigation task and extend the NPSP paradigm to the multi-objective case, applying NSGA2 to maximize a goal-agnostic metric (novelty) while minimizing a goal-aware metric (distance), in order to find the possible trade-offs between the two. We then introduce an additional goal-agnostic metric (exploration) and apply MAP-Elites to “illuminate” the feature space defined by novelty and exploration. Lastly, to assess how well the evolved synaptic rules generalize, we consider two modified settings: 1) sensors affected by random noise, and 2) augmented sensor perception. Overall, our results show that both the multi-objective and the MAP-Elites-based NPSP can find successful solutions in the different settings of the task.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
Promoting_Behavioral_Diversity_in_Neural_Evolution.pdf | Archive managers only | Refereed author’s manuscript (post-print) | All rights reserved | 847.14 kB | Adobe PDF
Promoting_Behavioral_Diversity_via_Multi-Objective_Quality-Diversity_Novelty_Producing_Synaptic_Plasticity.pdf | Archive managers only | Publisher’s layout | All rights reserved | 505.8 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.