AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation

Zara, G; Roy, S; Rota, P; Ricci, E
2023-01-01

Abstract

Open-set Unsupervised Video Domain Adaptation (OUVDA) is the task of adapting an action recognition model from a labelled source domain to an unlabelled target domain that contains "target-private" categories, which are present in the target but absent from the source. In this work we depart from prior approaches, which train a specialized open-set classifier or rely on weighted adversarial learning, and instead propose to use a pre-trained language and vision model, CLIP. CLIP is well suited to OUVDA because of its rich representations and zero-shot recognition capabilities. However, rejecting target-private instances with CLIP's zero-shot protocol requires oracle knowledge of the target-private label names. Because these label names cannot be known in advance, we propose AutoLabel, which automatically discovers and generates object-centric compositional candidate target-private class names. Despite its simplicity, we show that CLIP equipped with AutoLabel can satisfactorily reject target-private instances, thereby facilitating better alignment between the shared classes of the two domains. The code is available.
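
The rejection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that video and class-name embeddings have already been computed by some vision-language encoder, and it replaces them with toy 3-dimensional vectors. The function names (`cosine`, `zero_shot_predict`) and the candidate name "horse ride" are hypothetical and chosen only for illustration. A video is assigned the class name whose text embedding is most similar to it; if that name is a candidate target-private name rather than a shared one, the video is rejected as "unknown".

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_predict(video_emb, text_embs, shared_names):
    """Return (label, is_rejected) for one video embedding.

    text_embs maps every class name (shared and candidate target-private)
    to its text embedding. The video is rejected when the best-matching
    name is not one of the shared class names.
    """
    best = max(text_embs, key=lambda name: cosine(video_emb, text_embs[name]))
    if best in shared_names:
        return best, False
    return "unknown", True


# Toy embeddings standing in for encoder outputs (assumption, not real CLIP features).
text_embs = {
    "climb": [1.0, 0.0, 0.0],       # shared class
    "run": [0.0, 1.0, 0.0],         # shared class
    "horse ride": [0.0, 0.0, 1.0],  # hypothetical candidate target-private name
}
shared = {"climb", "run"}

known = zero_shot_predict([0.9, 0.1, 0.0], text_embs, shared)
private = zero_shot_predict([0.1, 0.0, 0.8], text_embs, shared)
print(known, private)
```

In this sketch the candidate names are given by hand; in the setting the abstract describes, supplying such names without oracle knowledge is precisely the role of AutoLabel.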
Year: 2023
Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher address: 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA
Publisher: IEEE COMPUTER SOC
ISBN: 979-8-3503-0129-8
Authors: Zara, G; Roy, S; Rota, P; Ricci, E
AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation / Zara, G; Roy, S; Rota, P; Ricci, E. - (2023), pp. 11504-11513. (Paper presented at the CVPR conference held in Vancouver, BC, Canada, 17-24 June 2023) [10.1109/CVPR52729.2023.01107].
Files in this item:

Zara_AutoLabel_CLIP-Based_Framework_for_Open-Set_Video_Domain_Adaptation_CVPR_2023_paper.pdf

Access: open access

Description: CVPR Open Access version
Type: Refereed post-print (Refereed author’s manuscript)
License: All rights reserved
Size: 1.58 MB
Format: Adobe PDF

AutoLabel_CLIP-based_framework_for_Open-Set_Video_Domain_Adaptation.pdf

Access: archive managers only

Type: Publisher’s version (Publisher’s layout)
License: All rights reserved
Size: 529.92 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/400408