Determining the Optimal Structural Resolution of Proteins through an Information-Theoretic Analysis of Their Conformational Ensemble / Mele, Margherita; Fiorentini, Raffaele; Tarenzi, Thomas; Mattiotti, Giovanni; Potestio, Raffaello. - In: JOURNAL OF CHEMICAL THEORY AND COMPUTATION. - ISSN 1549-9626. - 22:3(2026), pp. 1244-1257. [10.1021/acs.jctc.5c01773]

Determining the Optimal Structural Resolution of Proteins through an Information-Theoretic Analysis of Their Conformational Ensemble

Margherita Mele;Raffaele Fiorentini;Thomas Tarenzi;Giovanni Mattiotti;Raffaello Potestio
2026-01-01

Abstract

The choice of structural resolution is a fundamental aspect of protein modeling, determining the balance between descriptive power and interpretability. Although atomistic simulations provide maximal detail, much of this information is redundant for understanding the relevant large-scale motions and conformational states. Here, we introduce an unsupervised information-theoretic framework that determines the minimal number of atoms required to retain a maximally informative description of the configurational space sampled by a protein. This framework quantifies the informativeness of coarse-grained representations obtained by systematically decimating atomic degrees of freedom and evaluating the resulting clustering of sampled conformations. Application to molecular dynamics trajectories of dynamically diverse proteins shows that the optimal number of retained atoms scales linearly with system size, averaging about four heavy atoms per residue, remarkably consistent with the resolution of well-established coarse-grained models, such as MARTINI and SIRAH. Furthermore, the analysis shows that the optimal number of retained atoms depends not only on molecular size but also on the extent of conformational exploration, decreasing for systems dominated by collective motions. The proposed method establishes a general criterion to identify the minimal structural detail that preserves the essential configurational information, thereby offering a new viewpoint on the structure–dynamics–function relationship in proteins and guiding the construction of parsimonious yet informative multiscale models.
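The decimate-then-cluster idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the informativeness measure or the clustering scheme, so the code below assumes, purely for illustration, that atoms are retained at random, that frames are clustered by binning their pairwise distances, and that the Shannon entropy of the cluster populations serves as a stand-in score. All function names and parameters (`cluster_labels`, `cluster_entropy`, `scan_resolutions`, `bin_width`) are hypothetical.

```python
import numpy as np

def cluster_labels(coords, bin_width):
    """Label each frame by binning its pairwise distances (assumed clustering).

    coords has shape (n_frames, n_atoms, 3) with n_atoms >= 2.
    """
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)            # (n_frames, n_atoms, n_atoms)
    iu = np.triu_indices(coords.shape[1], k=1)      # unique atom pairs
    feats = np.floor(dist[:, iu[0], iu[1]] / bin_width).astype(int)
    # Frames with identical binned distance vectors fall in the same cluster.
    _, labels = np.unique(feats, axis=0, return_inverse=True)
    return labels.ravel()

def cluster_entropy(labels):
    """Shannon entropy (in nats) of the cluster populations (assumed score)."""
    counts = np.bincount(labels)
    p = counts[counts > 0] / labels.size
    return float(-(p * np.log(p)).sum())

def scan_resolutions(traj, sizes, bin_width=1.0, n_trials=5, seed=0):
    """Average the score over random atom subsets for each retained-atom count.

    Every entry of sizes must be between 2 and traj.shape[1].
    """
    rng = np.random.default_rng(seed)
    n_atoms = traj.shape[1]
    scores = {}
    for n in sizes:
        vals = []
        for _ in range(n_trials):
            kept = rng.choice(n_atoms, size=n, replace=False)  # decimation step
            vals.append(cluster_entropy(cluster_labels(traj[:, kept], bin_width)))
        scores[n] = float(np.mean(vals))
    return scores
```

Calling `scan_resolutions(traj, sizes=range(2, traj.shape[1] + 1))` on a trajectory array returns one averaged score per retained-atom count. Under the assumptions above, inspecting how that score changes with the number of retained atoms is the kind of scan from which an optimal resolution would be read off.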
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/479770