Simplifying open-set video domain adaptation with contrastive learning

Zara, Giacomo; Turrisi da Costa, Victor Guilherme; Roy, Subhankar; Rota, Paolo; Ricci, Elisa

doi:10.1016/j.cviu.2024.103953


In an effort to reduce annotation costs in action recognition, unsupervised video domain adaptation methods have been proposed that aim to adapt a predictive model from a labelled dataset (i.e., source domain) to an unlabelled dataset (i.e., target domain). In this work we address a more realistic scenario, called open-set video domain adaptation (OUVDA), where the target dataset contains "unknown" semantic categories that are not shared with the source. The challenge lies in aligning the shared classes of the two domains while separating the shared classes from the unknown ones. We propose to address OUVDA with a unified contrastive learning framework that learns discriminative and well-clustered features. We also propose a video-oriented temporal contrastive loss that enables our method to better cluster the feature space by exploiting the freely available temporal information in video data. We show that a discriminative feature space facilitates better separation of the unknown classes, and thereby allows us to use a simple similarity-based score to identify them. We conduct a thorough experimental evaluation on multiple OUVDA benchmarks and show the effectiveness of our proposed method against the prior art.
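The abstract mentions using a simple similarity-based score to identify unknown classes. As a rough illustration only, and not the authors' actual scoring rule, the sketch below assumes each known class is summarised by a prototype feature vector and that a target sample is rejected as "unknown" when its best cosine similarity falls below a threshold. The function names, the toy prototypes, and the threshold value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def predict_open_set(feature, prototypes, threshold=0.5):
    """Return (label, score) for the most similar known class, or
    ("unknown", score) if even the best similarity is below the threshold.

    `prototypes` maps a class label to a prototype vector; `threshold` is an
    assumed hyperparameter, not a value taken from the paper.
    """
    scores = {label: cosine(feature, proto) for label, proto in prototypes.items()}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, scores[best]
    return "unknown", scores[best]


# Toy 2-D prototypes, invented purely for illustration.
prototypes = {"run": [1.0, 0.0], "jump": [0.0, 1.0]}

known_label, known_score = predict_open_set([0.9, 0.1], prototypes)
unknown_label, unknown_score = predict_open_set([-1.0, -1.0], prototypes)
```

With well-clustered features, samples from shared classes should lie close to one prototype while samples from unknown classes lie far from all of them, which is why a single threshold on the similarity can be enough in a sketch like this.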

Simplifying open-set video domain adaptation with contrastive learning / Zara, G., Turrisi da Costa, V.G., Roy, S., Rota, P., Ricci, E. - In: COMPUTER VISION AND IMAGE UNDERSTANDING. - ISSN 1077-3142. - 241:103953 (2024). [10.1016/j.cviu.2024.103953]


2024-01-01



Date of publication: 2024
Journal title: COMPUTER VISION AND IMAGE UNDERSTANDING
Issue number and part: 103953
DOI: https://dx.doi.org/10.1016/j.cviu.2024.103953
Scopus identifier: 2-s2.0-85185552775
WOS identifier: WOS:001186444200001
			
			
Appears in categories: 03.1 Journal article

Files in this record:

File	Type	License	Access	Size	Format
2301.03322v1.pdf	Refereed author's manuscript (post-print)	All rights reserved	Archive managers only	1.57 MB	Adobe PDF
1-s2.0-S1077314224000341-main.pdf	Publisher's layout (published version)	All rights reserved	Archive managers only	1.15 MB	Adobe PDF