The mapping problem in coarse-grained modelling of biomolecules / Giulini, Marco. - (2022 Feb 14), pp. 1-174. [10.15168/11572_330532]

The mapping problem in coarse-grained modelling of biomolecules

Giulini, Marco
2022-02-14

Abstract

Low-resolution, coarse-grained models are powerful computational tools to investigate the behaviour of biological systems over time and length scales that are not accessible to all-atom Molecular Dynamics simulations. While several algorithms exist that aim at constructing accurate coarse-grained potentials, few works focus on the choice of the reduced representation, or mapping, employed to describe the high-resolution system with a smaller number of degrees of freedom. This thesis proposes a series of approaches to investigate and characterise the representation problem in coarse-grained modelling of proteins. This is achieved by employing a collection of diverse methods, including statistical mechanics, machine learning algorithms and information-theoretical tools. The central mathematical object of this work is the mapping entropy, a Kullback-Leibler divergence that measures the intrinsic quality of a given reduced representation. When this quantity is minimised, we obtain the maximally informative coarse-grained mappings of a biomolecule, which cover the structure with an uneven level of detail. Tests conducted over a set of well-known proteins show that regions preserved with high probability are often related to important functional mechanisms of the molecule. Applications of the mapping entropy outside the field of structural biology show promising results, leading to the identification of those combinations of features that retain the maximum amount of information about the high-resolution system. Additionally, a purely structural notion of scalar product and distance between coarse-grained mappings is introduced, which allows the metric and topological properties of the mapping space to be analysed. A thorough exploration of this space leads to the discovery of qualitatively different reduced representations of the biomolecule of interest.
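
As an illustration of the central quantity, the sketch below computes a mapping entropy for a toy discrete system. The abstract only states that the mapping entropy is a Kullback-Leibler divergence; the specific form used here is an assumption: the reference distribution is taken to be the coarse-grained probability spread uniformly over all microstates that share the same coarse-grained state, and k_B is set to 1. The three-spin system, the coupling J, and the function names toy_distribution and mapping_entropy are hypothetical and do not come from the thesis.

```python
import itertools
import math


def toy_distribution(coupling=1.0):
    """Boltzmann distribution of three spins in {-1, +1}.

    Hypothetical toy model: spins 0 and 1 are coupled with strength
    `coupling`, spin 2 is free. Returns {microstate: probability}.
    """
    states = list(itertools.product([-1, 1], repeat=3))
    weights = [math.exp(coupling * s[0] * s[1]) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}


def mapping_entropy(probs, retained):
    """KL divergence between the microstate distribution and its smeared version.

    `retained` lists the indices of the sites kept by the mapping.
    Assumed reference: P(macrostate) / (number of microstates mapping onto it).
    """
    macro_prob = {}
    macro_count = {}
    for state, p in probs.items():
        key = tuple(state[i] for i in retained)
        macro_prob[key] = macro_prob.get(key, 0.0) + p
        macro_count[key] = macro_count.get(key, 0) + 1

    total = 0.0
    for state, p in probs.items():
        if p > 0.0:
            key = tuple(state[i] for i in retained)
            p_bar = macro_prob[key] / macro_count[key]
            total += p * math.log(p / p_bar)
    return total


probs = toy_distribution()
two_site_mappings = list(itertools.combinations(range(3), 2))
scores = {m: mapping_entropy(probs, m) for m in two_site_mappings}
best = min(scores, key=scores.get)

for mapping, value in scores.items():
    print(mapping, round(value, 6))
print("most informative two-site mapping:", best)
```

In this toy case, retaining the two coupled spins loses no information, because the discarded spin is uniform and independent of the rest, whereas discarding either coupled spin yields a strictly positive value. Minimising over mappings therefore selects the sites that carry the correlations, which is the behaviour the abstract describes for proteins in qualitative terms.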
14-feb-2022
XXXIV
2020-2021
Fisica (29/10/12-)
Physics
Potestio, Raffaello
Menichetti, Roberto
no
English
Sector FIS/02 - Theoretical Physics, Mathematical Models and Methods
Files in this item:
phd_unitn_marco_giulini.pdf (open access)
Type: Doctoral Thesis
License: All rights reserved
Size: 35.49 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/330532