Bidirectional Transformer GAN for Long-term Human Motion Prediction

Zhao, My.; Tang, H.; Xie, P.; Dai, Sl.; Sebe, N.; Wang, W.

doi:10.1145/3579359

Mainstream motion prediction methods usually focus on short-term prediction, and their long-term predictions often collapse into an average pose, i.e., the freezing forecasting problem [27]. To mitigate this problem, we propose a novel Bidirectional Transformer-based Generative Adversarial Network (BiTGAN) for long-term human motion prediction. The bidirectional setup leads to consistent and smooth generation in both the forward and backward directions. To make full use of the historical motions, we split them into two parts: the first part is fed to the Transformer encoder in BiTGAN, while the second part is used as the decoder input. This strategy can alleviate the exposure problem [37]. Additionally, to better maintain both the local (i.e., frame-level pose) and global (i.e., video-level semantic) similarities between the predicted motion sequence and the real one, a soft dynamic time warping (Soft-DTW) loss is introduced into the generator. Finally, we use a dual discriminator to distinguish the predicted sequence at both the frame and sequence levels. Extensive experiments on the public Human3.6M dataset demonstrate that BiTGAN achieves state-of-the-art performance on long-term (4 s) human motion prediction and reduces the average error over all actions by 4%.
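The abstract names the Soft-DTW loss but does not give its formulation. As a rough illustration only, the standard Soft-DTW recurrence can be sketched in plain Python as below. This is not the authors' implementation: the function names `soft_min`, `frame_cost`, and `soft_dtw`, the smoothing parameter `gamma`, and the squared Euclidean frame cost are all assumptions made for this sketch, and a real training loss would need a differentiable tensor version.

```python
import math


def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    finite = [v for v in values if math.isfinite(v)]
    if not finite:
        return math.inf
    lo = min(finite)
    # Shift by the true minimum so every exponent is <= 0 (no overflow).
    return lo - gamma * math.log(sum(math.exp(-(v - lo) / gamma) for v in finite))


def frame_cost(a, b):
    """Squared Euclidean distance between two pose frames (assumed cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soft_dtw(pred, target, gamma=1.0):
    """Soft-DTW value between two sequences of frames (lists of floats)."""
    n, m = len(pred), len(target)
    # R[i][j] is the soft alignment cost of pred[:i] against target[:j].
    R = [[math.inf] * (m + 1) for _ in range(n + 1)]
    R[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = soft_min((R[i - 1][j], R[i][j - 1], R[i - 1][j - 1]), gamma)
            R[i][j] = frame_cost(pred[i - 1], target[j - 1]) + best
    return R[n][m]


# Toy 2-D "poses": a sequence, a delayed copy of it, and a spatially shifted copy.
seq = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5]]
delayed = [seq[0]] + seq
shifted = [[x + 5.0, y] for x, y in seq]

print(soft_dtw(seq, delayed, gamma=0.01))
print(soft_dtw(seq, shifted, gamma=0.01))
```

In this toy setup the delayed copy should score close to zero because the alignment absorbs the temporal offset, while the spatially shifted copy is penalized at every frame. That is the sense in which such a loss can compare sequences beyond strict frame-by-frame matching.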

Bidirectional Transformer GAN for Long-term Human Motion Prediction / Zhao, My.; Tang, H.; Xie, P.; Dai, Sl.; Sebe, N.; Wang, W. - In: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS. - ISSN 1551-6857. - 19:5(2023), pp. 16301-16319. [10.1145/3579359]

2023-01-01

Date of publication	2023
Journal title	ACM TRANSACTIONS ON MULTIMEDIA COMPUTING, COMMUNICATIONS AND APPLICATIONS
Issue number and part	5
DOI	https://dx.doi.org/10.1145/3579359
Scopus identifier	2-s2.0-85148734726
WOS identifier	WOS:001018509300012
Appears in types	03.1 Journal article

Files in this item:

File	Access	Description	Type	License	Size	Format
Bidirectional_TOMM22 .pdf	Open access	just accepted	Post-print (refereed author’s manuscript)	All rights reserved	1.62 MB	Adobe PDF
3579359.pdf	Archive managers only		Published version (publisher’s layout)	All rights reserved	4.6 MB	Adobe PDF