Interaction Transformer for Human Reaction Generation

We address the challenging task of human reaction generation, which aims to generate a reaction that corresponds to an input action. Most existing works do not focus on generating or predicting the reaction and cannot generate the motion when only the action is given as input. To address this limitation, we propose a novel interaction Transformer (InterFormer), a Transformer network with both temporal and spatial attention. Specifically, temporal attention captures the temporal dependencies of the motion of both characters and of their interaction, while spatial attention learns the dependencies between the body parts of each character and those involved in the interaction. Moreover, we propose using graphs to improve the performance of spatial attention via an interaction distance module that helps focus on nearby joints from both characters. Extensive experiments on the SBU interaction, K3HI, and DuetDance datasets demonstrate the effectiveness of InterFormer. Our method is general and can be used to generate more complex and long-term interactions. We also provide videos of generated reactions and the code with pre-trained models at https://github.com/CRISTAL-3DSAM/InterFormer
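The abstract says that an interaction distance module helps spatial attention focus on nearby joints from both characters. The sketch below shows one generic way such a bias could work; it is not taken from the paper or its repository. The function names (`pairwise_distances`, `distance_biased_attention`), the `scale` parameter, and the choice to subtract distances from the attention scores are all assumptions made for this example.

```python
import math

def pairwise_distances(joints_a, joints_b):
    """Euclidean distance between every joint of A and every joint of B."""
    return [[math.dist(p, q) for q in joints_b] for p in joints_a]

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def distance_biased_attention(scores, distances, scale=1.0):
    """Hypothetical bias: subtract scaled distances from the raw attention
    scores, then normalise each row so that closer joints get more weight."""
    biased = [[s - scale * d for s, d in zip(s_row, d_row)]
              for s_row, d_row in zip(scores, distances)]
    return [softmax(row) for row in biased]

# Toy example: two joints per character, 3-D coordinates (made-up values).
actor = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reactor = [(0.0, 0.1, 0.0), (5.0, 0.0, 0.0)]
dist = pairwise_distances(actor, reactor)

# Uniform raw scores, so any difference in the weights comes from the bias.
uniform_scores = [[0.0, 0.0], [0.0, 0.0]]
weights = distance_biased_attention(uniform_scores, dist)
```

With uniform raw scores, each actor joint ends up attending more strongly to the closer reactor joint. The authors' actual module is graph-based according to the abstract and may differ substantially from this illustration.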

Interaction Transformer for Human Reaction Generation / Chopin, B.; Tang, H.; Otberdout, N.; Daoudi, M.; Sebe, N. - In: IEEE TRANSACTIONS ON MULTIMEDIA. - ISSN 1520-9210. - 25 (2023), pp. 8842-8854. [10.1109/TMM.2023.3242152]

Chopin, B.; Tang, H.; Otberdout, N.; Daoudi, M.; Sebe, N.

2023-01-01

Date of publication: 2023
Journal title: IEEE TRANSACTIONS ON MULTIMEDIA
DOI: https://dx.doi.org/10.1109/TMM.2023.3242152
Scopus identifier: 2-s2.0-85148435155
WOS identifier: WOS:001125902000025
All authors: Chopin, B.; Tang, H.; Otberdout, N.; Daoudi, M.; Sebe, N.
Citation: Interaction Transformer for Human Reaction Generation / Chopin, B.; Tang, H.; Otberdout, N.; Daoudi, M.; Sebe, N. - In: IEEE TRANSACTIONS ON MULTIMEDIA. - ISSN 1520-9210. - 25 (2023), pp. 8842-8854. [10.1109/TMM.2023.3242152]
Appears in types: 03.1 Journal article

Files in this record:

File	Access	Type	License	Size	Format
Baptiste-TMM23-compressed.pdf	Open access	Refereed author’s manuscript (post-print)	All rights reserved	737.74 kB	Adobe PDF
Interaction_Transformer_for_Human_Reaction_Generation.pdf	Archive managers only	Publisher’s layout	All rights reserved	3.71 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/399718
