Why Approximate Matrix Square Roots Outperform Accurate SVD in Global Covariance Pooling?


Global Covariance Pooling (GCP) aims to exploit the second-order statistics of convolutional features. Its effectiveness in boosting the classification performance of Convolutional Neural Networks (CNNs) has been demonstrated. Singular Value Decomposition (SVD) is used in GCP to compute the matrix square root. However, the approximate matrix square root calculated using Newton-Schulz iteration [14] outperforms the accurate one computed via SVD [15]. We empirically analyze the reason behind the performance gap from the perspectives of data precision and gradient smoothness. Various remedies for computing smooth SVD gradients are investigated. Based on our observations and analyses, a hybrid training protocol is proposed for SVD-based GCP meta-layers such that competitive performance can be achieved against Newton-Schulz iteration. Moreover, we propose a new GCP meta-layer that uses SVD in the forward pass and Padé approximants in the backward pass to compute the gradients. The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performance on both large-scale and fine-grained datasets.
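To make the contrast in the abstract concrete, the following is a minimal sketch of a coupled Newton-Schulz iteration for the square root of a small symmetric positive-definite matrix. It is not taken from the paper: the helper names (`matmul`, `newton_schulz_sqrt`, `residual`), the Frobenius-norm pre-scaling, and the 2x2 example matrix are all assumptions made for illustration. The iteration only approximates the root, and the approximation error shrinks as the number of iterations grows, whereas an SVD or eigendecomposition would give the root up to floating-point precision.

```python
import math


def matmul(a, b):
    """Plain-Python product of two square matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def newton_schulz_sqrt(a, num_iters=5):
    """Approximate the square root of an SPD matrix `a` (hypothetical helper).

    Assumption: `a` is pre-scaled by its Frobenius norm so that the
    iteration converges; the result is rescaled afterwards.
    """
    n = len(a)
    norm = math.sqrt(sum(v * v for row in a for v in row))
    y = [[v / norm for v in row] for row in a]  # Y_0 = A / ||A||_F
    z = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # Z_0 = I
    for _ in range(num_iters):
        zy = matmul(z, y)
        # T = (3I - Z Y) / 2
        t = [[0.5 * ((3.0 if i == j else 0.0) - zy[i][j]) for j in range(n)]
             for i in range(n)]
        # Y <- Y T, Z <- T Z (both updates use the old Y and Z)
        y, z = matmul(y, t), matmul(t, z)
    scale = math.sqrt(norm)
    return [[scale * v for v in row] for row in y]


def residual(a, root):
    """Largest absolute entry of root @ root - a."""
    sq = matmul(root, root)
    n = len(a)
    return max(abs(sq[i][j] - a[i][j]) for i in range(n) for j in range(n))


# Illustrative 2x2 covariance-like matrix (made up for this sketch).
cov = [[4.0, 1.0], [1.0, 3.0]]
for k in (2, 5, 10):
    print(k, residual(cov, newton_schulz_sqrt(cov, num_iters=k)))
```

The loop at the end prints the reconstruction error for a few iteration counts, which shows how quickly the approximate root approaches the exact one on a well-conditioned input.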

Why Approximate Matrix Square Roots Outperform Accurate SVD in Global Covariance Pooling? / Song, Yue; Sebe, Nicu; Wang, Wei. - (2021), pp. 1095-1103. (Paper presented at the 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021, held as a virtual event, 11-17 October 2021) [10.1109/ICCV48922.2021.00115].

Song, Yue; Sebe, Nicu; Wang, Wei

2021-01-01

Date of publication: 2021
Proceedings title: Proceedings: 2021 IEEE/CVF International Conference on Computer Vision
Place of publication: Piscataway, NJ
Publisher: IEEE
ISBN: 978-1-6654-2812-5
Scopus identifier: 2-s2.0-85127758156
WOS identifier: WOS:000797698901028
All authors: Song, Yue; Sebe, Nicu; Wang, Wei
Appears in types: 04.1 Paper in Proceedings

Files in this item:

File	Access	Type	License	Size	Format
Song_Why_Approximate_Matrix_Square_Root_Outperforms_Accurate_SVD_in_Global_ICCV_2021_paper (1).pdf	Open access	Refereed author's manuscript (post-print)	All rights reserved	4.84 MB	Adobe PDF
Why_Approximate_Matrix_Square_Root_Outperforms_Accurate_SVD_in_Global_Covariance_Pooling.pdf	Archive managers only	Publisher's layout (published version)	All rights reserved	5.35 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/326204
