Theoretical Properties of Equivariant Neural Networks / Pacini, Marco. - (2026 Apr 17), pp. 1-133.
Theoretical Properties of Equivariant Neural Networks
Pacini, Marco
2026-04-17
Abstract
Modern deep neural networks often operate in highly overparameterized regimes, where they can interpolate training data while still generalizing effectively. This empirical phenomenon motivates the study of model expressivity and, in particular, of how architectural design choices shape approximation capabilities. This thesis develops a mathematical framework to analyze the expressivity of deep learning architectures through the lens of approximation theory, with a primary focus on symmetry-preserving models. We study equivariant neural networks, where inductive biases can be encoded precisely in representation-theoretic terms by restricting the hypothesis space to functions compatible with prescribed symmetries. This setting provides a principled framework to investigate how architectural choices (such as depth, intermediate representations, nonlinearities, and readout mechanisms) affect approximation properties.

The first contribution of the thesis concerns the interaction between equivariance and pointwise nonlinearities. We characterize which pairs of group representations and pointwise activations induce non-trivial equivariant nonlinear maps, thereby identifying when standard layer constructions yield non-degenerate hypothesis spaces.

The second contribution focuses on separation power, namely the ability of an architecture to distinguish inputs that are not equivalent under the action of the symmetry group. We develop tools to study separation in a general equivariant framework, beyond graph-specific Weisfeiler–Leman techniques, and analyze how architectural hyperparameters and design choices influence this property.

The third contribution studies universality under symmetry constraints. We show that separation alone does not always suffice to guarantee approximation of all continuous functions compatible with the prescribed inductive bias. In particular, we establish that sufficient depth or suitable readout layers are required to recover universality, and we prove that this stronger form of universality can always be achieved under prescribed separation constraints.

Overall, the thesis provides a rigorous approximation-theoretic account of how symmetry-aware architectural design affects expressivity, yielding new theoretical insights and practical design principles for equivariant deep learning models.
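For orientation, the two notions at the heart of the abstract admit standard formalizations; the notation below (the group G, representations \rho_in and \rho_out, hypothesis class \mathcal{F}) is illustrative shorthand and may differ from the conventions adopted in the thesis itself. A map f between representation spaces is G-equivariant when it commutes with the group action,

    f(\rho_{\mathrm{in}}(g)\,x) = \rho_{\mathrm{out}}(g)\,f(x)  \quad \text{for all } g \in G \text{ and all inputs } x,

and the separation power of a class \mathcal{F} is typically captured by the equivalence relation it induces on inputs,

    x \sim_{\mathcal{F}} y \iff f(x) = f(y) \quad \text{for all } f \in \mathcal{F},

so that a more expressive class induces a finer relation. Universality asks for more than separation: every continuous function respecting the prescribed symmetry must be approximable, which is where the abstract's conditions on depth and readout layers come in.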
| File | Size | Format |
|---|---|---|
| PhD_Thesis (5).pdf (open access; Doctoral Thesis; Creative Commons license) | 1.12 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.