### Fast Differentiable Matrix Square Root

#### Abstract

Computing the matrix square root or its inverse in a differentiable manner is important in a variety of computer vision tasks. Previous methods either adopt the Singular Value Decomposition (SVD) to explicitly factorize the matrix or use the Newton-Schulz (NS) iteration to derive an approximate solution. However, neither method is computationally efficient in the forward or the backward pass. In this paper, we propose two more efficient variants to compute the differentiable matrix square root. For the forward propagation, one method uses the Matrix Taylor Polynomial (MTP), and the other uses Matrix Padé Approximants (MPA). The backward gradient is computed by iteratively solving the continuous-time Lyapunov equation using the matrix sign function. Both methods yield considerable speed-up compared with the SVD or the Newton-Schulz iteration. Experimental results on decorrelated batch normalization and the second-order vision transformer demonstrate that our methods can also achieve competitive and even slightly better performance. The code is available at https://github.com/KingJamesSong/FastDifferentiableMatSqrt.
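As an illustration of the forward-pass idea, the following sketch approximates the square root of a symmetric positive-definite matrix with a truncated Matrix Taylor Polynomial. This is a minimal NumPy reconstruction of the general MTP technique, not the paper's implementation; the function name `mtp_sqrt` and the truncation `degree` are illustrative choices.

```python
import numpy as np

def mtp_sqrt(A, degree=16):
    """Approximate sqrt(A) for SPD A with a truncated Matrix Taylor Polynomial.

    Uses sqrt(A) = sqrt(s) * (I - Z)^{1/2} with s = ||A||_F and Z = I - A/s,
    expanding (I - Z)^{1/2} via the binomial series
    sum_k binom(1/2, k) (-Z)^k, truncated at `degree`.
    """
    n = A.shape[0]
    s = np.linalg.norm(A, 'fro')   # normalization so the series converges
    Z = np.eye(n) - A / s
    term = np.eye(n)               # holds (-Z)^k
    coeff = 1.0                    # holds binom(1/2, k), starting at k = 0
    Y = coeff * term
    for k in range(1, degree + 1):
        coeff *= (0.5 - (k - 1)) / k   # binom(1/2, k) from binom(1/2, k-1)
        term = term @ (-Z)
        Y = Y + coeff * term
    return np.sqrt(s) * Y
```

The polynomial involves only matrix multiplications, which is the source of the speed-up over an SVD: it maps well to batched GPU kernels and avoids the eigendecomposition entirely.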
Year: 2022
Venue: The Tenth International Conference on Learning Representations (ICLR’22), held virtually, 25–29 April 2022
Place: S.l.
Publisher: OpenReview
Authors: Song, Yue; Sebe, Niculae; Wang, Wei
Citation: Fast Differentiable Matrix Square Root / Song, Yue; Sebe, Niculae; Wang, Wei. - (2022), pp. 1-19. (Paper presented at The Tenth International Conference on Learning Representations (ICLR’22), held virtually, 25–29 April 2022.)
Files in this record:
352_fast_differentiable1.pdf — open access
Type: Publisher’s version (publisher’s layout)
Use this identifier to cite or link to this record: `https://hdl.handle.net/11572/361304`