Dual-Head Contrastive Domain Adaptation for Video Action Recognition / da Costa, V. G. T.; Zara, G.; Rota, P.; Oliveira-Santos, T.; Sebe, N.; Murino, V.; Ricci, E. - (2022), pp. 2234-2243. (Paper presented at WACV 2022, held in Waikoloa, HI, USA, 3-8 January 2022) [10.1109/WACV48630.2021].

Dual-Head Contrastive Domain Adaptation for Video Action Recognition.

da Costa, V. G. T.; Zara, G.; Rota, P.; Sebe, N.; Ricci, E.
2022-01-01

Abstract

Unsupervised domain adaptation (UDA) methods have become very popular in computer vision. However, while several techniques have been proposed for images, much less attention has been devoted to videos. This paper introduces a novel UDA approach for action recognition from videos, inspired by recent literature on contrastive learning. In particular, we propose a novel two-headed deep architecture that simultaneously adopts cross-entropy and contrastive losses from different network branches to robustly learn a target classifier. Moreover, this work introduces a novel large-scale UDA dataset, Mixamo→Kinetics, which, to the best of our knowledge, is the first dataset that considers the domain shift arising when transferring knowledge from synthetic to real video sequences. Our extensive experimental evaluation conducted on three publicly available benchmarks and on our new Mixamo→Kinetics dataset demonstrates the effectiveness of our approach, which outperforms the current state-of-the-art methods. Code is available at https://github.com/vturrisi/CO2A.
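To illustrate the dual-head idea described in the abstract, below is a minimal, hypothetical PyTorch-style sketch (it is not the authors' released implementation, which is available at the repository linked above): one head is a cross-entropy classifier trained on labelled source videos, while a second projection head is trained with an InfoNCE-style contrastive loss on unlabelled target videos. The names DualHeadNet, info_nce, and the loss weight lam are illustrative assumptions, not names from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualHeadNet(nn.Module):
    # Hypothetical sketch, not the authors' code: a shared backbone with a
    # classification head (cross-entropy branch) and a projection head
    # (contrastive branch).
    def __init__(self, backbone, feat_dim, num_classes, proj_dim=128):
        super().__init__()
        self.backbone = backbone                            # e.g. a video feature extractor
        self.cls_head = nn.Linear(feat_dim, num_classes)    # cross-entropy branch
        self.proj_head = nn.Sequential(                     # contrastive branch
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, x):
        feats = self.backbone(x)
        logits = self.cls_head(feats)
        embeds = F.normalize(self.proj_head(feats), dim=-1)
        return logits, embeds

def info_nce(z1, z2, temperature=0.1):
    # Contrastive loss between two batches of L2-normalised embeddings;
    # positives are the matching rows (diagonal of the similarity matrix).
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# One training step, assuming a labelled source batch (xs, ys) and two
# augmented views (xt1, xt2) of an unlabelled target batch:
#   logits_s, _ = model(xs)
#   _, zt1 = model(xt1)
#   _, zt2 = model(xt2)
#   loss = F.cross_entropy(logits_s, ys) + lam * info_nce(zt1, zt2)

The point the sketch tries to capture is that, as stated in the abstract, the cross-entropy and contrastive losses are applied to different branches of the same backbone, so the contrastive objective shapes the shared representation without acting directly on the classifier logits.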
2022
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision
Waikoloa, HI, USA
IEEE
978-1-6654-0915-5
da Costa, V. G. T.; Zara, G.; Rota, P.; Oliveira-Santos, T.; Sebe, N.; Murino, V.; Ricci, E.
Files in this record:

da_Costa_Dual-Head_Contrastive_Domain_WACV_2022_supplemental.pdf
  Access: Restricted to repository managers
  Description: Other material
  Type: Publisher's version (Publisher's layout)
  License: All rights reserved
  Size: 860.73 kB
  Format: Adobe PDF

Dual-Head_Contrastive_Domain_Adaptation_for_Video_Action_Recognition.pdf
  Access: Restricted to repository managers
  Type: Publisher's version (Publisher's layout)
  License: All rights reserved
  Size: 1.93 MB
  Format: Adobe PDF

da_Costa_Dual-Head_Contrastive_Domain_Adaptation_for_Video_Action_Recognition_WACV_2022_paper (1).pdf
  Access: Open access
  Description: Computer Vision Foundation
  Type: Publisher's version (Publisher's layout)
  License: All rights reserved
  Size: 1.3 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/329965
Citations
  • PMC: ND
  • Scopus: 25
  • Web of Science (ISI): 20
  • OpenAlex: ND