Hyperbolic Vision Transformers: Combining Improvements in Metric Learning

Ermolov, Aleksandr; Mirvakhabova, Leyla; Khrulkov, Valentin; Sebe, Nicu; Oseledets, Ivan

doi:10.1109/CVPR52688.2022.00726

Metric learning aims to learn a highly discriminative model that pulls the embeddings of similar classes close together under the chosen metric and pushes those of dissimilar classes apart. The common recipe is to use an encoder to extract embeddings and a distance-based loss function to match the representations; usually, the Euclidean distance is used. An emerging interest in learning hyperbolic data embeddings suggests that hyperbolic geometry can be beneficial for natural data. Following this line of work, we propose a new hyperbolic-based model for metric learning. At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space. These embeddings are directly optimized using a modified pairwise cross-entropy loss. We evaluate the proposed model with six different formulations on four datasets, achieving new state-of-the-art performance. The source code is available at https://github.com/htdt/hyp_metric.
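To make the abstract's pipeline concrete, the following is a minimal sketch, not the authors' implementation, of the two geometric ingredients it mentions: mapping an encoder output into hyperbolic space and scoring pairs by hyperbolic distance inside a pairwise cross-entropy loss. It assumes the Poincaré ball model with the standard exponential map at the origin and Möbius addition; the function names, the curvature parameter `c`, the temperature `tau` and its default value are illustrative assumptions rather than details taken from this record.

```python
import math

def expmap0(v, c=1.0):
    """Exponential map at the origin: Euclidean vector -> Poincare ball of curvature -c."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def mobius_add(x, y, c=1.0):
    """Mobius addition of two points inside the Poincare ball."""
    xy = sum(a * b for a, b in zip(x, y))
    x2 = sum(a * a for a in x)
    y2 = sum(b * b for b in y)
    denom = 1 + 2 * c * xy + c * c * x2 * y2
    return [((1 + 2 * c * xy + c * y2) * a + (1 - c * x2) * b) / denom
            for a, b in zip(x, y)]

def poincare_dist(x, y, c=1.0):
    """Geodesic distance between two points of the Poincare ball."""
    diff = mobius_add([-a for a in x], y, c)
    norm = math.sqrt(sum(d * d for d in diff))
    # Clamp just below 1 so atanh stays finite near the ball boundary.
    arg = min(math.sqrt(c) * norm, 1 - 1e-12)
    return 2 / math.sqrt(c) * math.atanh(arg)

def pairwise_cross_entropy(anchor, positive, others, tau=0.2, c=1.0):
    """Cross-entropy for one positive pair, using negative distances as logits.

    `tau` (temperature) is a hypothetical parameter of this sketch.
    """
    logits = [-poincare_dist(anchor, z, c) / tau for z in [positive] + others]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# Usage: three toy 2-D "encoder outputs" mapped into the ball.
a = expmap0([1.0, 0.0])
p = expmap0([0.9, 0.1])   # same class as the anchor
n = expmap0([-1.0, 0.0])  # different class
print(pairwise_cross_entropy(a, p, [n]))
```

In this sketch the loss is small when the positive lies closer to the anchor than the other samples do, which is the behaviour the abstract describes for similar and dissimilar classes; the repository linked above remains the authoritative reference for the actual model.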

Hyperbolic Vision Transformers: Combining Improvements in Metric Learning / Ermolov, Aleksandr; Mirvakhabova, Leyla; Khrulkov, Valentin; Sebe, Nicu; Oseledets, Ivan. - 2022-:(2022), pp. 7399-7409. (2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, 2022) [10.1109/CVPR52688.2022.00726].

2022-01-01

Date of publication: 2022
Proceedings title: IEEE/CVF Conference on Computer Vision and Pattern Recognition
Place of publication: Piscataway, NJ USA
Publisher: IEEE
ISBN: 978-1-6654-6946-3
Scopus identifier: 2-s2.0-85140203414
WOS identifier: WOS:000870759100024
Type: 04.1 Paper in Proceedings

Files in this item:

File	Access	Type	License	Size	Format
Ermolov_Hyperbolic_Vision_Transformers_Combining_Improvements_in_Metric_Learning_CVPR_2022_paper.pdf	Open access	Publisher’s layout	All rights reserved	2.77 MB	Adobe PDF