Data-efficient control policy search using residual dynamics learning

IRIS

In this work, we propose a model-based, data-efficient approach to reinforcement learning. The main idea of our algorithm is to combine simulated and real rollouts to efficiently find an optimal control policy. While performing rollouts on the robot, we exploit sensory data to learn a probabilistic model of the residual difference between the measured state and the state predicted by a simplified model. The simplified model can be any dynamical system, from a very accurate one to a simple, linear one. The residual difference is learned with Gaussian processes. Hence, we assume that the difference between the real system and the simplified model is Gaussian distributed, which is a weaker assumption than requiring the real system itself to be Gaussian distributed. The combination of the simplified model and the learned residuals is exploited to predict the real system behavior and to search for an optimal policy. Simulations and experiments show that our approach significantly reduces the number of rollouts needed to find an optimal control policy for the real system.
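The residual-learning idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar toy system in `true_step`, the linear `simple_step` model, the squared-exponential kernel with fixed hyperparameters, and the class name `ResidualDynamicsModel`. The sketch fits a Gaussian process to the difference between measured next states and the simplified model's predictions, then predicts with the sum of the two.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel (unit signal variance) between rows of A and B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class ResidualDynamicsModel:
    """Simplified model plus a GP fitted to its one-step prediction residual."""

    def __init__(self, simple_model, length_scale=1.0, noise_var=1e-4):
        self.simple_model = simple_model
        self.length_scale = length_scale  # fixed here; normally optimized
        self.noise_var = noise_var

    def fit(self, X, U, X_next):
        # GP inputs are state-action pairs; targets are the residuals.
        self.Z = np.hstack([X, U])
        residual = X_next - self.simple_model(X, U)
        K = rbf_kernel(self.Z, self.Z, self.length_scale)
        K += self.noise_var * np.eye(len(self.Z))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} residual, computed via the Cholesky factor.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, residual))
        return self

    def predict(self, X, U):
        Z_star = np.hstack([X, U])
        K_star = rbf_kernel(Z_star, self.Z, self.length_scale)
        # Predicted next state = simplified model + GP posterior mean of the residual.
        mean = self.simple_model(X, U) + K_star @ self.alpha
        v = np.linalg.solve(self.L, K_star.T)
        var = np.maximum(1.0 - np.sum(v**2, axis=0), 0.0)  # GP posterior variance
        return mean, var


def true_step(X, U):
    """Hypothetical 'real' system: linear part plus an unmodeled nonlinearity."""
    return 0.9 * X + 0.5 * U + 0.1 * np.sin(X)


def simple_step(X, U):
    """Hypothetical simplified model: the linear part only."""
    return 0.9 * X + 0.5 * U


rng = np.random.default_rng(0)
X_train = rng.uniform(-2.0, 2.0, size=(60, 1))
U_train = rng.uniform(-1.0, 1.0, size=(60, 1))
X_next_train = true_step(X_train, U_train) + 0.001 * rng.standard_normal((60, 1))

model = ResidualDynamicsModel(simple_step).fit(X_train, U_train, X_next_train)

X_test = rng.uniform(-2.0, 2.0, size=(200, 1))
U_test = rng.uniform(-1.0, 1.0, size=(200, 1))
mean, var = model.predict(X_test, U_test)
target = true_step(X_test, U_test)

simple_err = np.mean(np.abs(simple_step(X_test, U_test) - target))
combined_err = np.mean(np.abs(mean - target))
print(f"simplified model error: {simple_err:.4f}")
print(f"simplified + GP residual error: {combined_err:.4f}")
```

In a policy search loop, a combined model of this kind would stand in for the real system during simulated rollouts, with the posterior variance indicating where the learned residual is still uncertain. How the paper propagates that uncertainty and optimizes the policy is not specified in this record, so the sketch stops at one-step prediction.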

Data-efficient control policy search using residual dynamics learning / Saveriano, M.; Yin, Y.; Falco, P.; Lee, D. - 2017-:(2017), pp. 4709-4715. (Paper presented at the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, held in Canada in 2017) [10.1109/IROS.2017.8206343].

Saveriano M.;Yin Y.;Falco P.;Lee D.

2017-01-01

Date of publication: 2017
Proceedings title: IEEE International Conference on Intelligent Robots and Systems
Place of publication: Piscataway, New Jersey, USA
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN: 978-1-5386-2682-5
Scopus identifier: 2-s2.0-85041955392
WOS identifier: WOS:000426978204079
All authors: Saveriano, M.; Yin, Y.; Falco, P.; Lee, D.
Item type: 04.1 Paper in Proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/331051

Warning: the data shown have not been validated by the university.
