Modern deep neural networks often operate in highly overparameterized regimes, where they can interpolate training data while still generalizing effectively. This empirical phenomenon motivates the study of model expressivity and, in particular, of how architectural design choices shape approximation capabilities. This thesis develops a mathematical framework to analyze the expressivity of deep learning architectures through the lens of approximation theory, with a primary focus on symmetry-preserving models. We study equivariant neural networks, where inductive biases can be encoded precisely in representation-theoretic terms by restricting the hypothesis space to functions compatible with prescribed symmetries. This setting provides a principled framework to investigate how architectural choices—such as depth, intermediate representations, nonlinearities, and readout mechanisms—affect approximation properties. The first contribution of the thesis concerns the interaction between equivariance and pointwise nonlinearities. We characterize which pairs of group representations and pointwise activations induce non-trivial equivariant nonlinear maps, thereby identifying when standard layer constructions lead to non-degenerate hypothesis spaces. The second contribution focuses on separation power, namely the ability of an architecture to distinguish non-equivalent inputs under the action of the symmetry group. We develop tools to study separation in a general equivariant framework beyond graph-specific Weisfeiler--Leman techniques, and analyze how architectural hyperparameters and design choices influence this property. The third contribution studies universality under symmetry constraints. We show that separation alone does not always suffice to guarantee approximation of all continuous functions compatible with the prescribed inductive bias. 
In particular, we establish that sufficient depth or suitable readout layers are necessary to recover universality, and we prove that this stronger form of universality can always be achieved under prescribed separation constraints. Overall, the thesis provides a rigorous approximation-theoretic account of how symmetry-aware architectural design affects expressivity, yielding new theoretical insights and practical design principles for equivariant deep learning models.
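As a concrete, hedged illustration of the equivariance constraint discussed in the abstract (not code from the thesis itself): a map f is equivariant when f(ρ_in(g)x) = ρ_out(g)f(x) for every group element g. The sketch below implements a classical permutation-equivariant linear layer, x ↦ λx + γ·mean(x)·1, in the style of Deep Sets, and checks the constraint numerically; all names and parameter values here are illustrative, not taken from the thesis.

```python
import numpy as np

def perm_equivariant_layer(x, lam=2.0, gamma=-0.5):
    """Permutation-equivariant linear layer on a set of n scalars:
    f(x) = lam * x + gamma * mean(x) * 1  (Deep Sets-style sketch)."""
    return lam * x + gamma * x.mean() * np.ones_like(x)

rng = np.random.default_rng(0)
x = rng.normal(size=5)
perm = rng.permutation(5)  # a group element g acting by permuting coordinates

# Equivariance check: f(g . x) == g . f(x)
lhs = perm_equivariant_layer(x[perm])
rhs = perm_equivariant_layer(x)[perm]
assert np.allclose(lhs, rhs)
```

The mean term is what couples the coordinates: it is invariant under permutations, so adding it back to each coordinate preserves equivariance while making the layer genuinely non-diagonal.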

Theoretical Properties of Equivariant Neural Networks / Pacini, Marco. - (2026 Apr 17), pp. 1-133.

Defense date: 17 April 2026
Cycle: XXXVII
Academic year: 2024-2025
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Information and Communication Technology
Supervisor: Lepri, Bruno
Language: English
Sector: ING-INF/05 - Sistemi di Elaborazione delle Informazioni (Information Processing Systems)
Files in this product:
File: PhD_Thesis (5).pdf (open access)
Type: Doctoral Thesis (Tesi di dottorato)
License: Creative Commons
Size: 1.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/482950