Hindsight Experience Replay with Evolutionary Decision Trees for Curriculum Goal Generation / Sayar, Erdi; Vintaykin, Vladislav; Iacca, Giovanni; Knoll, Alois. - 14635:(2024), pp. 3-18. (Paper presented at EvoApplications 2024, held in Aberystwyth, 3rd-5th April 2024) [10.1007/978-3-031-56855-8_1].

Hindsight Experience Replay with Evolutionary Decision Trees for Curriculum Goal Generation

Iacca, Giovanni;
2024-01-01

Abstract

Reinforcement learning (RL) algorithms often require a significant number of experiences to learn a policy capable of achieving desired goals in multi-goal robot manipulation tasks with sparse rewards. Hindsight Experience Replay (HER) is an existing method that improves learning efficiency by using failed trajectories and replacing the original goals with hindsight goals that are uniformly sampled from the visited states. However, HER has a limitation: the hindsight goals are mostly near the initial state, which hinders solving tasks efficiently if the desired goals are far from the initial state. To overcome this limitation, we introduce a curriculum learning method called HERDT (HER with Decision Trees). HERDT uses binary DTs to generate curriculum goals that guide a robotic agent progressively from an initial state toward a desired goal. During the warm-up stage, DTs are optimized using the Grammatical Evolution algorithm. In the training stage, curriculum goals are then sampled by DTs to help the agent navigate the environment. Since binary DTs generate discrete values, we fine-tune these curriculum points by incorporating a feedback value (i.e., the Q-value). This fine-tuning enables us to adjust the difficulty level of the generated curriculum points, ensuring that they are neither overly simplistic nor excessively challenging. In other words, these points are precisely tailored to match the robot’s ongoing learning policy. We evaluate our proposed approach on different sparse reward robotic manipulation tasks and compare it with the state-of-the-art HER approach. Our results demonstrate that our method consistently outperforms or matches the existing approach in all the tested tasks.
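As a rough illustration of the hindsight relabeling described in the abstract (a minimal sketch, not the authors' implementation; the trajectory layout, `reward_fn`, and `her_relabel` are all hypothetical names chosen here), the core HER idea of replacing the original goal with later-achieved goals can be sketched as:

```python
import random

def her_relabel(trajectory, reward_fn, k=4):
    """Sketch of HER's 'future' relabeling strategy.

    trajectory: list of (state, action, next_state, achieved_goal) tuples
    reward_fn(achieved, goal): sparse reward, e.g. 0.0 if reached else -1.0
    k: number of hindsight goals sampled per transition
    """
    relabeled = []
    for t, (s, a, s_next, achieved) in enumerate(trajectory):
        # Hindsight goals: achieved goals visited from step t onward,
        # so failed episodes still yield successful (zero-reward) samples.
        future = [step[3] for step in trajectory[t:]]
        for g in random.sample(future, min(k, len(future))):
            relabeled.append((s, a, s_next, g, reward_fn(achieved, g)))
    return relabeled

# Toy 1-D example: the agent moves right from 0 to 5; goals are positions.
traj = [(i, +1, i + 1, i + 1) for i in range(5)]
reward = lambda achieved, goal: 0.0 if achieved == goal else -1.0
extra = her_relabel(traj, reward)
```

The paper's contribution builds on top of this mechanism: rather than sampling goals uniformly from visited states, evolved decision trees propose curriculum goals, which are then fine-tuned using the Q-value as a difficulty signal.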
2024
Applications of Evolutionary Computation. EvoApplications 2024
Cham, Switzerland
Springer
9783031568541
9783031568558
Sayar, Erdi; Vintaykin, Vladislav; Iacca, Giovanni; Knoll, Alois
Files in this item:

File: Hindsight Experience Replay with Evolutionary Decision Trees for Curriculum Goal Generation.pdf
Access: Archive administrators only
Type: Publisher's version (Publisher's layout)
License: All rights reserved
Size: 3.87 MB
Format: Adobe PDF

File: sayar.pdf
Access: Under embargo until 21/03/2025
Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 5.16 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11572/405931