Low-resolution, coarse-grained models are powerful computational tools for investigating the behaviour of biological systems over time and length scales that are not accessible to all-atom Molecular Dynamics simulations. While several algorithms exist that aim at constructing accurate coarse-grained potentials, few works focus on the choice of the reduced representation, or mapping, employed to describe the high-resolution system with a smaller number of degrees of freedom. This thesis proposes a series of approaches to investigate and characterise the representation problem in coarse-grained modelling of proteins. This is achieved by employing a collection of diverse methods, including statistical mechanics, machine learning algorithms and information-theoretical tools. The central mathematical object of this work is the mapping entropy, a Kullback-Leibler divergence that measures the intrinsic quality of a given reduced representation. When this quantity is minimised, we obtain the maximally informative coarse-grained mappings of a biomolecule, which cover the structure with an uneven level of detail. Tests conducted on a set of well-known proteins show that regions preserved with high probability are often related to important functional mechanisms of the molecule. Applications of the mapping entropy outside the field of structural biology show promising results, leading to the identification of those combinations of features that retain the maximum amount of information about the high-resolution system. Additionally, a purely structural notion of scalar product and distance between coarse-grained mappings is introduced, which makes it possible to analyse the metric and topological properties of the mapping space. A thorough exploration of this space leads to the discovery of qualitatively different reduced representations of the biomolecule of interest.
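To make the idea of a Kullback-Leibler divergence associated with a mapping concrete, the following is a minimal sketch on a toy discrete system. It is an illustration under stated assumptions and not the thesis's implementation: the microstates, their probabilities, the function name `mapping_entropy` and the two example mappings are all hypothetical, the Boltzmann constant is omitted, and the reference distribution is assumed to spread each coarse-grained state's probability uniformly over the microstates mapped onto it.

```python
import math
from collections import defaultdict


def mapping_entropy(p, mapping):
    """Discrete, dimensionless analogue of a mapping entropy (assumed form).

    p       : dict microstate -> probability (should sum to 1)
    mapping : dict microstate -> coarse-grained (macrostate) label

    Returns the Kullback-Leibler divergence between p and the "smeared"
    distribution p_bar, in which every microstate sharing a macrostate
    receives the same probability.
    """
    macro_prob = defaultdict(float)  # total probability of each macrostate
    macro_size = defaultdict(int)    # number of microstates in each macrostate
    for state, prob in p.items():
        macro_prob[mapping[state]] += prob
        macro_size[mapping[state]] += 1

    s_map = 0.0
    for state, prob in p.items():
        if prob > 0.0:  # terms with zero probability contribute nothing
            label = mapping[state]
            p_bar = macro_prob[label] / macro_size[label]
            s_map += prob * math.log(prob / p_bar)
    return s_map


# Hypothetical four-state system with two likely and two unlikely states.
p = {"a": 0.4, "b": 0.4, "c": 0.1, "d": 0.1}

# Groups states of equal probability: no information is lost.
good_map = {"a": "A", "b": "A", "c": "B", "d": "B"}
# Mixes likely and unlikely states: information is lost.
bad_map = {"a": "A", "c": "A", "b": "B", "d": "B"}

print(mapping_entropy(p, good_map))
print(mapping_entropy(p, bad_map))
```

In this toy setting the first mapping yields a divergence of zero, while the second yields a strictly positive value, which is the sense in which a lower value indicates a more informative reduced representation.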
The mapping problem in coarse-grained modelling of biomolecules / Giulini, Marco. - (2022 Feb 14), pp. 1-174. [10.15168/11572_330532]
The mapping problem in coarse-grained modelling of biomolecules
Giulini, Marco
2022-02-14
File | Size | Format |
---|---|---|
phd_unitn_marco_giulini.pdf (open access; type: Doctoral Thesis; licence: All rights reserved) | 35.49 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.