Theoretical Properties of Equivariant Neural Networks / Pacini, Marco. - (2026 Apr 17), pp. 1-133.
Theoretical Properties of Equivariant Neural Networks
Pacini, Marco
2026-04-17
Abstract
Modern deep neural networks often operate in highly overparameterized regimes, where they can interpolate training data while still generalizing effectively. This empirical phenomenon motivates the study of model expressivity and, in particular, of how architectural design choices shape approximation capabilities. This thesis develops a mathematical framework to analyze the expressivity of deep learning architectures through the lens of approximation theory, with a primary focus on symmetry-preserving models. We study equivariant neural networks, where inductive biases can be encoded precisely in representation-theoretic terms by restricting the hypothesis space to functions compatible with prescribed symmetries. This setting provides a principled framework to investigate how architectural choices (such as depth, intermediate representations, nonlinearities, and readout mechanisms) affect approximation properties.

The first contribution of the thesis concerns the interaction between equivariance and pointwise nonlinearities. We characterize which pairs of group representations and pointwise activations induce non-trivial equivariant nonlinear maps, thereby identifying when standard layer constructions yield non-degenerate hypothesis spaces.

The second contribution focuses on separation power, namely the ability of an architecture to distinguish inputs that are not equivalent under the action of the symmetry group. We develop tools to study separation in a general equivariant framework, beyond graph-specific Weisfeiler–Leman techniques, and analyze how architectural hyperparameters and design choices influence this property.

The third contribution studies universality under symmetry constraints. We show that separation alone does not always suffice to guarantee approximation of all continuous functions compatible with the prescribed inductive bias. In particular, we establish that sufficient depth or suitable readout layers are required to recover universality, and we prove that this stronger form of universality can always be achieved under prescribed separation constraints.

Overall, the thesis provides a rigorous approximation-theoretic account of how symmetry-aware architectural design affects expressivity, yielding new theoretical insights and practical design principles for equivariant deep learning models.
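For orientation, the two notions at the heart of the abstract admit standard formalizations; the notation below (the group G, representations \rho_in and \rho_out, hypothesis class \mathcal{F}) is illustrative shorthand and may differ from the conventions adopted in the thesis itself. A map f between representation spaces is G-equivariant when it commutes with the group action,

    f(\rho_{\mathrm{in}}(g)\,x) = \rho_{\mathrm{out}}(g)\,f(x)  \quad \text{for all } g \in G \text{ and all inputs } x,

and the separation power of a class \mathcal{F} is typically captured by the equivalence relation it induces on inputs,

    x \sim_{\mathcal{F}} y \iff f(x) = f(y) \quad \text{for all } f \in \mathcal{F},

so that a more expressive class induces a finer relation. Universality asks for more than separation: every continuous function respecting the prescribed symmetry must be approximable, which is where the abstract's conditions on depth and readout layers come in.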
| File | Size | Format |
|---|---|---|
| PhD_Thesis (5).pdf (open access; Doctoral Thesis; Creative Commons license) | 1.12 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.