Gesture-to-gesture translation in the wild via category-independent conditional maps / Liu, Y.; De Nadai, M.; Zen, G.; Sebe, N.; Lepri, B. (2019), pp. 1916-1924. (Paper presented at the ACM International Conference on Multimedia (ACM Multimedia’19), held in Nice, 21-25 October 2019) [10.1145/3343031.3351020].
Gesture-to-gesture translation in the wild via category-independent conditional maps
Y. Liu; M. De Nadai; G. Zen; N. Sebe; B. Lepri
2019
Abstract
Recent works have shown Generative Adversarial Networks (GANs) to be particularly effective in image-to-image translation. However, in tasks such as body pose and hand gesture translation, existing methods usually require precise annotations, e.g. key-points or skeletons, which are time-consuming to draw. In this work, we propose a novel GAN architecture that decouples the required annotations into a category label, which specifies the gesture type, and a simple-to-draw category-independent conditional map, which expresses the location, rotation and size of the hand gesture. Our architecture synthesizes the target gesture while preserving the background context, thus effectively dealing with gesture translation in the wild. To this aim, we use an attention module and a rolling guidance approach, which loops the generated images back into the network and produces higher-quality images than competing works. Thus, our GAN learns to generate new images from simple annotations without requiring key-points or skeleton labels. Results on two public datasets show that our method outperforms state-of-the-art approaches both quantitatively and qualitatively. To the best of our knowledge, no work so far has addressed gesture-to-gesture translation in the wild while requiring only user-friendly annotations.
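The abstract names three mechanisms: conditioning decoupled into a category label plus a conditional map, an attention module that preserves the background, and rolling guidance that feeds generated images back into the network. No code accompanies this record, so the following is a minimal PyTorch sketch of how such a generator could be wired together; all module names, channel sizes, and the two-pass refinement loop are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GestureGenerator(nn.Module):
    """Illustrative conditional generator in the spirit of the paper:
    it consumes a source image, a one-hot gesture category label, and a
    single-channel conditional map (location/rotation/size of the hand),
    and predicts a foreground image plus a soft attention mask that
    blends the synthesized gesture with the preserved background."""

    def __init__(self, num_categories=10, base_ch=64):
        super().__init__()
        # Input channels: RGB image (3) + conditional map (1) + broadcast label.
        in_ch = 3 + 1 + num_categories
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, base_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(base_ch, base_ch * 2, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base_ch * 2, base_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base_ch, 4, 4, stride=2, padding=1),  # 3 RGB + 1 mask
        )

    def forward(self, img, cond_map, label):
        # Broadcast the category label to a spatial tensor and concatenate
        # it with the source image and the conditional map.
        b, _, h, w = img.shape
        label_map = label.view(b, -1, 1, 1).expand(b, label.size(1), h, w)
        x = torch.cat([img, cond_map, label_map], dim=1)
        out = self.decoder(self.encoder(x))
        fg = torch.tanh(out[:, :3])        # synthesized foreground
        attn = torch.sigmoid(out[:, 3:4])  # attention mask in [0, 1]
        # The attention mask blends the generated gesture with the
        # original background, so unchanged pixels are copied through.
        return attn * fg + (1 - attn) * img


def rolling_guidance(gen, img, cond_map, label, steps=2):
    """Rolling guidance: feed each generated image back into the generator
    so later passes refine the gesture under the same conditions."""
    out = img
    for _ in range(steps):
        out = gen(out, cond_map, label)
    return out
```

The design point the sketch tries to capture is that the attention mask lets the network copy background pixels from the source image unchanged, while rolling guidance reuses the same category label and conditional map across passes, so each iteration only refines the synthesized gesture rather than re-translating the whole scene.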
| File | Type | License | Size | Format |
|---|---|---|---|---|
| 1907.05916.pdf (open access) | Non-refereed preprint | Other type of license | 5.4 MB | Adobe PDF |
| 3343031.3351020.pdf (open access) | Publisher’s layout | All rights reserved | 2.99 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.