Generalizing Under Data Scarcity. Enhancing the representation capability from few samples.

Braccaioli, Lorenzo

The widespread adoption of deep learning in both research and industrial contexts has revealed a central limitation: many real-world applications lack large, diverse, and reliably labeled datasets. This challenge is particularly evident in domains where data acquisition is costly, error-prone, or inherently scarce, such as industrial inspection, anomaly detection and localization. This thesis investigates how learning systems can be designed to operate effectively when only a small number of samples are available at training or inference time.
The first part of the thesis focuses on meta-learning transformers for supervised and unsupervised few-shot tasks. We explore how transformers behave when trained on structured, multi-domain datasets under controlled conditions, where train/test contamination can be explicitly avoided. By reframing few-shot learning as a sequence modeling problem, we analyze the generalization capabilities of in-context learners across domains, and study how training order influences performance. We propose the GEOM framework for supervised few-shot classification and extend its principles to the unsupervised setting with CAMeLU, demonstrating state-of-the-art performance in cross-domain scenarios.
The second part of the thesis addresses the gap between academic research and real-world industrial constraints. Working with an Italian company specializing in glass inspection systems, we propose two domain-specific solutions. The first is a few-shot approach for structural glass defect classification, enabling flexible adaptation to new defect types and variations in glass materials. The second is a reconstruction-based anomaly detection pipeline for identifying irregularities in silk-screen printed patterns, where labeled data are extremely scarce.
This dissertation highlights the importance of designing models that do not rely on large-scale datasets, but instead leverage task structure, adaptation mechanisms, and data-efficient learning strategies. By bridging foundational research on meta-learning with concrete industrial use cases, the thesis demonstrates that few-shot paradigms can be robust, scalable, and practically applicable in demanding environments.

Generalizing Under Data Scarcity. Enhancing the representation capability from few samples / Braccaioli, Lorenzo. - (2026 Apr 15), pp. 1-140.

2026-04-15

Defended on	15 Apr 2026
Cycle	XXXVIII
Academic Year	2024-2025
Department	Information Engineering and Computer Science (29/10/12-)
Doctoral programme	Industrial Innovation
Unitn internal supervisor	Conci, Nicola
External Co-supervisor	Chiara Corridori
Bi-nationally supervised Doctoral Thesis	no
Language	English
Appears in types:	08.1 Doctoral Thesis

Files in this item:

File	Size	Format
phd_unitn_braccaioli_lorenzo.pdf (embargoed until 25/03/2028; type: Doctoral Thesis; license: All rights reserved)	11.37 MB	Adobe PDF