Learning How to Smile: Expression Video Generation with Conditional Adversarial Recurrent Nets

While several research studies have focused on analyzing human behavior and, in particular, emotional signals from visual data, the problem of synthesizing face video sequences with specific attributes (e.g., age, facial expressions) has received much less attention. This paper proposes a novel deep generative model able to produce face videos from a given image of a neutral face and a label indicating a specific facial expression, e.g., a spontaneous smile. Our framework consists of two main building blocks: an image generator and a frame sequence generator. The image generator is implemented as a deep neural model that combines generative adversarial networks and variational auto-encoders, while the sequence generator is a label-conditioned recurrent neural network. In the proposed framework, given a neutral face and a label as input, the sequence generator outputs a set of hidden representations with smooth transitions, each corresponding to a video frame. The image generator then decodes the hidden representations into the actual face images. To ensure that the network generates videos consistent with the given label, a novel identity adversarial loss is proposed. Our experimental results demonstrate the effectiveness of the framework and the advantage of introducing an adversarial component into recurrent models for face video generation.
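The two-stage data flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the class names `SequenceGenerator` and `ImageDecoder`, the simple tanh recurrence, the random stand-in weights, and the toy dimensions are all hypothetical. The latent code of the neutral face (`h0`) is assumed to come from an encoder that is not shown, and no adversarial or variational training is included.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.5):
    """Random weight matrix as a list of rows (stand-in for learned weights)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """Encode an expression label as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


class SequenceGenerator:
    """Toy label-conditioned recurrent net: h_t = tanh(W_h h_{t-1} + W_y y)."""

    def __init__(self, latent_dim, num_labels, rng):
        self.w_h = make_matrix(latent_dim, latent_dim, rng)
        self.w_y = make_matrix(latent_dim, num_labels, rng)
        self.num_labels = num_labels

    def generate(self, h0, label, num_frames):
        # The label term is constant over time; the hidden state evolves.
        cond = matvec(self.w_y, one_hot(label, self.num_labels))
        h, codes = list(h0), []
        for _ in range(num_frames):
            rec = matvec(self.w_h, h)
            h = [math.tanh(a + b) for a, b in zip(rec, cond)]
            codes.append(h)
        return codes


class ImageDecoder:
    """Toy decoder: maps one hidden representation to pixel values in (0, 1)."""

    def __init__(self, latent_dim, num_pixels, rng):
        self.w = make_matrix(num_pixels, latent_dim, rng)

    def decode(self, code):
        return [1.0 / (1.0 + math.exp(-z)) for z in matvec(self.w, code)]


def generate_video(h0, label, num_frames, seq_gen, decoder):
    """Stage 1: hidden representations per frame. Stage 2: decode each one."""
    return [decoder.decode(code) for code in seq_gen.generate(h0, label, num_frames)]


# Usage with hypothetical sizes: 4-D latent code, 3 labels, 6-pixel "frames".
rng = random.Random(0)
seq_gen = SequenceGenerator(latent_dim=4, num_labels=3, rng=rng)
decoder = ImageDecoder(latent_dim=4, num_pixels=6, rng=rng)
neutral_code = [0.0, 0.0, 0.0, 0.0]  # assumed output of an encoder (not shown)
video = generate_video(neutral_code, label=1, num_frames=5,
                       seq_gen=seq_gen, decoder=decoder)
```

The sketch only mirrors the separation of concerns stated in the abstract: the recurrent model works entirely in the hidden-representation space, and the image generator is applied frame by frame afterwards.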

Learning How to Smile: Expression Video Generation with Conditional Adversarial Recurrent Nets / Wang, W.; Alameda-Pineda, X.; Xu, D.; Ricci, E.; Sebe, N. - In: IEEE TRANSACTIONS ON MULTIMEDIA. - ISSN 1520-9210. - 22:11(2020), pp. 2808-2819. [10.1109/TMM.2019.2963621]

Wang W.; Alameda-Pineda X.; Xu D.; Ricci E.; Sebe N.

2020-01-01

Date of publication: 2020
Journal title: IEEE TRANSACTIONS ON MULTIMEDIA
Issue number and part: 11
DOI: https://dx.doi.org/10.1109/TMM.2019.2963621
Scopus identifier: 2-s2.0-85077386512
WOS identifier: WOS:000584239900005
All authors: Wang, W.; Alameda-Pineda, X.; Xu, D.; Ricci, E.; Sebe, N.
Appears in types: 03.1 Journal article

Files in this record:

File	Type	License	Access	Size	Format
08948254.pdf	Publisher's layout	All rights reserved	Archive managers only	6.16 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/251274
