A Geometrical Approach to Machine Learning: From Complexity Measures to Quantization / Datres, Massimiliano. - (2025 Apr 10), pp. 1-103.
A Geometrical Approach to Machine Learning: From Complexity Measures to Quantization
Datres, Massimiliano
2025-04-10
Abstract
This thesis is divided into two parts. In the first part, we introduce 2sED, a novel capacity measure for statistical models based on the effective dimension. This quantity provably bounds the generalization error under mild assumptions on the model. Furthermore, simulations on standard data sets and popular model architectures show that 2sED correlates well with the training error. For Markovian models, we show how to efficiently approximate 2sED from below through a layer-wise iterative approach, which allows us to tackle deep learning models with a large number of parameters. Simulation results suggest that the approximation is good for several prominent models and data sets. In the second part, we present GeoPTQ, a novel approach to post-training quantization. Standard quantization methods often fail to minimize performance loss optimally. We precisely formulate the weight post-training quantization problem, showing that conventional techniques can be suboptimal, particularly for linear models. By leveraging the Gauss-Newton distance induced by the problem, we bound the quantization error quantitatively (in the model dimension) in the NTK framework, that is, for scaled neural networks obtained via gradient descent training with residual error, starting from Gaussian i.i.d. initialization. Furthermore, we introduce a theoretically grounded quantization algorithm that preserves generalization performance while minimizing quantization error. Experimental results confirm that GeoPTQ consistently outperforms traditional methods across a range of models and data sets. Even though these two works address different machine learning topics, they share a common approach of examining the problem's geometry to understand how model complexity and parameters influence performance.

| File | Size | Format |
|---|---|---|
| PhDThesis_Datres-2.pdf (embargoed until 10/04/2026) | 1.55 MB | Adobe PDF |

Type: Tesi di dottorato (Doctoral Thesis)
License: Tutti i diritti riservati (All rights reserved)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.