Density of states in neural networks: an in-depth exploration of learning in parameter space / Mele, M.; Menichetti, R.; Ingrosso, A.; Potestio, R. - In: TRANSACTIONS ON MACHINE LEARNING RESEARCH. - ISSN 2835-8856. - 2025:(2025).
Density of states in neural networks: an in-depth exploration of learning in parameter space
Mele M. (first author); Menichetti R. (second author); Potestio R. (co-last author)
2025-01-01
Abstract
Learning in neural networks critically hinges on the intricate geometry of the loss landscape associated with a given task. Traditionally, most research has focused on finding specific weight configurations that minimize the loss. In this work, born from the cross-fertilization of machine learning and theoretical soft matter physics, we introduce a novel approach to examine the weight space across all loss values. Employing the Wang-Landau enhanced sampling algorithm, we explore the neural network density of states – the number of network parameter configurations that produce a given loss value – and analyze how it depends on specific features of the training set. Using both real-world and synthetic data, we quantitatively elucidate the relation between data structure and network density of states across different sizes and depths of binary-state networks. This work presents and illustrates a novel, informative analysis method that aims to pave the way for a better understanding of the interplay between structured data and the networks that process, learn, and generate them.
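To make the notion of a network density of states concrete, the following is a minimal sketch of Wang-Landau sampling on a toy problem. It is not the authors' implementation. The model (a single-layer perceptron with weights in {-1, +1}), the loss (number of misclassified training examples), the random synthetic data, and all names and parameter values (`count_errors`, `exact_counts`, `wang_landau`, `ln_f_final`, `flatness`, `check_every`, `max_steps`) are assumptions chosen for illustration. The system is small enough that the estimate can be compared against brute-force enumeration.

```python
import itertools
import math
import random


def count_errors(weights, inputs, labels):
    """Loss of a toy binary perceptron: number of misclassified examples."""
    errors = 0
    for x, y in zip(inputs, labels):
        field = sum(w * xi for w, xi in zip(weights, x))
        # With an odd number of +/-1 weights and inputs the field is never zero.
        prediction = 1 if field > 0 else -1
        if prediction != y:
            errors += 1
    return errors


def exact_counts(inputs, labels, n_weights):
    """Brute-force density of states: number of configurations per loss value."""
    counts = [0] * (len(inputs) + 1)
    for weights in itertools.product((-1, 1), repeat=n_weights):
        counts[count_errors(weights, inputs, labels)] += 1
    return counts


def wang_landau(inputs, labels, n_weights, rng,
                ln_f_final=1e-3, flatness=0.8, check_every=1000,
                max_steps=2_000_000):
    """Estimate ln g(E), normalised so that sum_E g(E) = 2**n_weights."""
    n_levels = len(inputs) + 1
    log_g = [0.0] * n_levels       # running estimate of ln g(E)
    hist = [0] * n_levels          # visit histogram for the current stage
    visited = [False] * n_levels   # loss values reached at least once
    weights = [rng.choice((-1, 1)) for _ in range(n_weights)]
    current = count_errors(weights, inputs, labels)
    ln_f = 1.0
    step = 0
    while ln_f > ln_f_final and step < max_steps:
        i = rng.randrange(n_weights)
        weights[i] = -weights[i]  # propose a single weight flip
        proposed = count_errors(weights, inputs, labels)
        # Accept with probability min(1, g(current) / g(proposed)).
        delta = log_g[current] - log_g[proposed]
        if rng.random() < math.exp(min(0.0, delta)):
            current = proposed
        else:
            weights[i] = -weights[i]  # reject: undo the flip
        log_g[current] += ln_f
        hist[current] += 1
        visited[current] = True
        step += 1
        if step % check_every == 0:
            seen = [h for h, v in zip(hist, visited) if v]
            if min(seen) >= flatness * sum(seen) / len(seen):
                ln_f /= 2.0  # histogram is flat: refine the modification factor
                hist = [0] * n_levels
    # Shift ln g so the visited levels sum to the 2**n_weights configurations.
    seen_log_g = [lg for lg, v in zip(log_g, visited) if v]
    ref = max(seen_log_g)
    log_total = ref + math.log(sum(math.exp(lg - ref) for lg in seen_log_g))
    shift = n_weights * math.log(2.0) - log_total
    return [lg + shift if v else None for lg, v in zip(log_g, visited)]


rng = random.Random(0)
n_weights, n_examples = 11, 5  # hypothetical toy sizes
inputs = [[rng.choice((-1, 1)) for _ in range(n_weights)]
          for _ in range(n_examples)]
labels = [rng.choice((-1, 1)) for _ in range(n_examples)]

exact = exact_counts(inputs, labels, n_weights)
estimate = wang_landau(inputs, labels, n_weights, rng)
for level, (count, lg) in enumerate(zip(exact, estimate)):
    exact_ln = math.log(count) if count else None
    print(level, exact_ln, lg)
```

Each printed row gives a loss value, the exact ln g from enumeration, and the Wang-Landau estimate. Enumeration is only feasible here because the toy network has 2^11 configurations. For larger networks only the sampling route remains practical, which is the regime the abstract addresses.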



