Numerical methods, the collective name for numerical analysis and optimization techniques, have been widely used in the field of computer vision and deep learning. In this thesis, we investigate the algorithms of some numerical methods and their relevant applications in deep learning. These studied numerical techniques mainly include differentiable matrix power functions, differentiable eigendecomposition (ED), feasible orthogonal matrix constraints in optimization and latent semantics discovery, and physics-informed techniques for solving partial differential equations in disentangled and equivariant representation learning. We first propose two numerical solvers for the faster computation of matrix square root and its inverse. The proposed algorithms are demonstrated to have considerable speedup in practical computer vision tasks. Then we turn to resolve the main issues when integrating differentiable ED into deep learning -- backpropagation instability, slow decomposition for batched matrices, and ill-conditioned input throughout the training. Some approximation techniques are first leveraged to closely approximate the backward gradients while avoiding gradient explosion, which resolves the issue of backpropagation instability. To improve the computational efficiency of ED, we propose an efficient ED solver dedicated to small and medium batched matrices that are frequently encountered as input in deep learning. Some orthogonality techniques are also proposed to improve input conditioning. All of these techniques combine to mitigate the difficulty of applying differentiable ED in deep learning. In the last part of the thesis, we rethink some key concepts in disentangled representation learning. We first investigate the relation between disentanglement and orthogonality -- the generative models are enforced with different proposed orthogonality to show that the disentanglement performance is indeed improved. We also challenge the linear assumption of the latent traversal paths and propose to model the traversal process as dynamic spatiotemporal flows on the potential landscapes. Finally, we build probabilistic generative models of sequences that allow for novel understandings of equivariance and disentanglement. We expect our investigation could pave the way for more in-depth and impactful research at the intersection of numerical methods and deep learning.

Numerical Methods in Deep Learning and Computer Vision / Song, Yue. - (2024 Apr 23), pp. 1-156. [10.15168/11572_406633]

Numerical Methods in Deep Learning and Computer Vision

Song, Yue
2024-04-23

Abstract

Numerical methods, the collective name for numerical analysis and optimization techniques, have been widely used in the field of computer vision and deep learning. In this thesis, we investigate the algorithms of some numerical methods and their relevant applications in deep learning. These studied numerical techniques mainly include differentiable matrix power functions, differentiable eigendecomposition (ED), feasible orthogonal matrix constraints in optimization and latent semantics discovery, and physics-informed techniques for solving partial differential equations in disentangled and equivariant representation learning. We first propose two numerical solvers for the faster computation of matrix square root and its inverse. The proposed algorithms are demonstrated to have considerable speedup in practical computer vision tasks. Then we turn to resolve the main issues when integrating differentiable ED into deep learning -- backpropagation instability, slow decomposition for batched matrices, and ill-conditioned input throughout the training. Some approximation techniques are first leveraged to closely approximate the backward gradients while avoiding gradient explosion, which resolves the issue of backpropagation instability. To improve the computational efficiency of ED, we propose an efficient ED solver dedicated to small and medium batched matrices that are frequently encountered as input in deep learning. Some orthogonality techniques are also proposed to improve input conditioning. All of these techniques combine to mitigate the difficulty of applying differentiable ED in deep learning. In the last part of the thesis, we rethink some key concepts in disentangled representation learning. We first investigate the relation between disentanglement and orthogonality -- the generative models are enforced with different proposed orthogonality to show that the disentanglement performance is indeed improved. We also challenge the linear assumption of the latent traversal paths and propose to model the traversal process as dynamic spatiotemporal flows on the potential landscapes. Finally, we build probabilistic generative models of sequences that allow for novel understandings of equivariance and disentanglement. We expect our investigation could pave the way for more in-depth and impactful research at the intersection of numerical methods and deep learning.
23-apr-2024
XXXVI
2023-2024
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Sebe, Niculae
no
Inglese
Settore INF/01 - Informatica
File in questo prodotto:
File Dimensione Formato  
2024_Thesis_Yue.pdf

accesso aperto

Tipologia: Tesi di dottorato (Doctoral Thesis)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 61.2 MB
Formato Adobe PDF
61.2 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/406633
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact