Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis

Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis / Tang, H.; Shao, L.; Zhang, Z.; Van Gool, L.; Sebe, N. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - 47:11(2025), pp. 9626-9636. [10.1109/TPAMI.2025.3588237]

Tang, H.; Shao, L.; Zhang, Z.; Van Gool, L.; Sebe, N.

2025-01-01

Abstract

We propose a novel spatial-temporal graph Mamba (STG-Mamba) for the music-guided dance video synthesis task, i.e., to translate the input music to a dance video. STG-Mamba consists of two translation mappings: music-to-skeleton translation and skeleton-to-video translation. In the music-to-skeleton translation, we introduce a novel spatial-temporal graph Mamba (STGM) block to effectively construct skeleton sequences from the input music, capturing dependencies between joints in both the spatial and temporal dimensions. For the skeleton-to-video translation, we propose a novel self-supervised regularization network to translate the generated skeletons, along with a conditional image, into a dance video. Lastly, we collect a new skeleton-to-video translation dataset from the Internet, containing 54,944 video clips. Extensive experiments demonstrate that STG-Mamba achieves significantly better results than existing methods.
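To make "dependencies between joints in both the spatial and temporal dimensions" concrete, here is a minimal sketch of how a skeleton sequence can be viewed as a spatial-temporal graph. It is not the authors' STGM block, and this record gives no detail of that design. It only illustrates the kind of graph structure such a block could operate on. The five-joint skeleton, the `BONES` list, and the function name `build_st_edges` are hypothetical and chosen for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical toy skeleton (an assumption, not the paper's joint layout):
# 0 = hip, 1 = neck, 2 = head, 3 = left hand, 4 = right hand.
BONES: List[Tuple[int, int]] = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5

Node = Tuple[int, int]  # (frame index, joint index)


def build_st_edges(
    num_frames: int,
    num_joints: int = NUM_JOINTS,
    bones: List[Tuple[int, int]] = BONES,
) -> Set[Tuple[Node, Node]]:
    """Return the edge set of a spatial-temporal skeleton graph.

    Spatial edges link joints connected by a bone within one frame.
    Temporal edges link the same joint in consecutive frames.
    """
    edges: Set[Tuple[Node, Node]] = set()
    for t in range(num_frames):
        # Spatial dimension: bones inside frame t.
        for a, b in bones:
            edges.add(((t, a), (t, b)))
        # Temporal dimension: each joint to itself in frame t + 1.
        if t + 1 < num_frames:
            for j in range(num_joints):
                edges.add(((t, j), (t + 1, j)))
    return edges


if __name__ == "__main__":
    edges = build_st_edges(num_frames=3)
    print(len(edges))
```

For three frames of this toy skeleton the graph has 12 spatial edges (4 bones in each of 3 frames) and 10 temporal edges (5 joints across 2 frame transitions).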

Date of publication: 2025
Journal title: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Issue number and part: 11
DOI: https://dx.doi.org/10.1109/TPAMI.2025.3588237
PubMed identifier: 40663669
Scopus identifier: 2-s2.0-105011174125
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/464941