Introduction

Understanding how biological systems work at the molecular level is crucial for unraveling the mechanisms that underlie life itself. Proteins and RNA, two essential components of cellular structure and processes, exhibit remarkable structural and functional diversity. Traditional experimental techniques have provided valuable insights into their behavior; however, they often fall short in capturing the dynamic nature of these biomolecules. Over the past few decades, multiscale molecular dynamics (MD) simulations have emerged as a powerful computational tool to bridge this gap, enabling the study of biological systems at atomistic resolution. My Ph.D. thesis uses multiscale MD simulations to explore the dynamic landscape of proteins and RNA, shedding light on their folding mechanisms, conformational transitions, and functional dynamics. By combining classical atomistic MD with more advanced modelling techniques, such as coarse-grained models, this research highlights the limitations of conventional simulations, proposes possible ways to overcome them, and aims at a comprehensive understanding of the dynamics governing these biomolecules.

Chapter 1

The first chapter provides an overview of the theoretical foundations and practical aspects of classical all-atom molecular dynamics (MD) simulations. It begins with a schematic derivation of the all-atom MD equations, emphasizing how the electronic degrees of freedom are integrated out and how they contribute to the potential energy. The chapter then focuses on the energy terms used in all-atom force fields and on their role in accurately representing the interactions among atoms. The discussion extends to the theoretical underpinnings of thermostats, drawing on statistical mechanics to explain how they control the temperature during simulations.
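As an illustration of how the equations of motion mentioned in Chapter 1 are propagated in practice, the following minimal sketch integrates Newton's equation for a single particle in a one-dimensional harmonic well. The velocity-Verlet scheme, the harmonic force, and all names and parameters are illustrative assumptions; the abstract does not specify which integrator the thesis uses.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * a = F(x) for one particle in 1D (illustrative sketch)."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # first half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

def harmonic_force(x, k=1.0):
    """Hypothetical test force: F = -k x."""
    return -k * x

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

# Reduced units: m = k = 1, so the exact solution is x(t) = cos(t).
traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=1.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift after 1000 steps: {e_end - e_start:.2e}")
```

In a thermostatted run, an additional step would modify the velocities so that the kinetic energy samples the desired temperature; that step is deliberately left out of this sketch.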
Furthermore, periodic boundary conditions and the particle mesh Ewald method are discussed, highlighting their importance in simulating (or at least mimicking) large systems and in accounting for long-range electrostatic interactions. By covering these foundational concepts and techniques, this chapter lays the groundwork for the subsequent investigations based on multiscale molecular dynamics simulations.

Chapter 2

The second chapter characterizes the conformational space of the Shwachman-Bodian-Diamond syndrome (SBDS) protein, a critical component involved in cellular processes. Specifically, this research employs all-atom molecular dynamics simulations to study the wild-type SBDS and 12 missense mutations of clinical relevance. The simulations are initiated from two distinct NMR structures, representing an open and a closed conformation respectively, to capture a wide range of conformational variability. Each starting conformation is simulated for the wild type and all 12 mutations, resulting in a total of 26 simulations with a cumulative sampling time of 13 $\mu$s. The analyses of these simulations provide insights into the effects of missense mutations on SBDS dynamics and function. The investigation reveals a trend common to all mutations, namely increased residue fluctuations in the hinge I-II region. This observation suggests potential interference with the conformational changes involving the reorientation of domains II-III and the detachment of eIF6 from the 60S subunit. Furthermore, the study highlights the structural similarity of the K67E mutant to the wild type, despite a lower exposed positive charge. This finding, supported by free energy analysis, suggests that the pathological mechanism associated with this mutation may be linked to a decrease in binding affinity rather than to structural deformation.
Additionally, the simulations of R19Q and C84R exhibit lower binding affinity specifically in closed trajectories, corroborating experimental observations regarding their potential impact on RNA binding. Moreover, K151 and R218 prove important in stabilizing the conformation assumed by SBDS upon binding to the 60S subunit. Notably, the dynamics-based clustering and free energy analysis highlight the distinct behavior of the K151N mutation, both in open and closed simulations, suggesting that compromised dynamics may hinder the protein's ability to stabilize a functional conformation for effective cooperation with EFL1. Collectively, this chapter contributes to our understanding of SBDS dynamics and of the effects of missense mutations, paving the way for further investigations into the molecular mechanisms underlying Shwachman-Diamond syndrome.

Chapter 3

The third chapter explores the principles and applications of coarse-graining (CG) and multiscale modeling techniques in computational biophysics. After providing a theoretical foundation for CG, the chapter presents several examples of CG models employed in the research. These include Elastic Network Models (ENMs), which capture the essential dynamics of proteins by simplifying their atomistic representation. Additionally, the oxRNA model, designed specifically for RNA molecules, and the CANVAS multi-resolution model for proteins are introduced, showcasing their ability to capture the key features of the molecular system while significantly reducing computational complexity. Furthermore, the chapter covers the theory of implicit solvation, which plays a crucial role in some of the aforementioned models. Implicit solvation methods treat solvent effects without explicitly simulating water molecules, thereby reducing computational costs.
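The elastic network idea mentioned in Chapter 3 can be sketched in a few lines: beads closer than a cutoff in a reference structure are connected by harmonic springs whose rest lengths are the reference distances. The function names, the single spring constant, the cutoff value, and the toy coordinates are assumptions chosen for illustration, not the specific ENM variant used in the thesis.

```python
import math

def build_elastic_network(ref_coords, cutoff):
    """Return springs (i, j, rest_length) for all bead pairs within the cutoff."""
    springs = []
    for i in range(len(ref_coords)):
        for j in range(i + 1, len(ref_coords)):
            d0 = math.dist(ref_coords[i], ref_coords[j])
            if d0 < cutoff:
                springs.append((i, j, d0))
    return springs

def enm_energy(coords, springs, k=1.0):
    """Harmonic energy: (k / 2) * sum over springs of (d_ij - d0_ij)^2."""
    energy = 0.0
    for i, j, d0 in springs:
        d = math.dist(coords[i], coords[j])
        energy += 0.5 * k * (d - d0) ** 2
    return energy

# Toy reference structure: four beads on a line, 1 length unit apart.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
springs = build_elastic_network(reference, cutoff=1.5)

# Stretch the last bond by 0.5 and evaluate the energy of the deformation.
deformed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.5, 0.0, 0.0)]
e_reference = enm_energy(reference, springs)
e_deformed = enm_energy(deformed, springs)
print(len(springs), e_reference, e_deformed)
```

By construction the reference structure is the energy minimum, which is why such models describe fluctuations around a known conformation rather than folding.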
Notably, by introducing implicit solvation, the chapter sets the stage for Chapter 5, which presents a novel implicit solvation technique based on artificial neural networks. By exploring CG and multiscale modeling techniques, along with their associated solvation models, this chapter equips readers with the tools needed to understand the methods developed to study large-scale biological systems at reduced computational cost.

Chapter 4

The fourth chapter, extracted from the paper \textit{``In search of a dynamical vocabulary: a pipeline to construct a basis of shared traits in large-scale motions of proteins''}, published in Applied Sciences, introduces a structure-based pipeline for capturing the main features of large-scale protein motions. The pipeline aims to provide a general description of protein motion, not only for the proteins whose structures are used as input but also for structurally similar proteins that were not included in the initial dataset. To demonstrate its effectiveness, the pipeline is applied to a set of 116 chymotrypsin-related proteases. The workflow successfully captures dynamical features of proteins that are structurally similar to, but not part of, the input structures used to build the basis set of the dynamical space. This allows shared traits in large-scale motions to be characterized beyond the limits of individual protein structures. Overall, this chapter contributes to the establishment of a dynamical vocabulary for the study of protein motion.
Chapter 5

The fifth chapter introduces a novel method for the implicit solvation of biomolecules in molecular dynamics simulations, based on artificial neural networks (ANNs). The chapter begins by formally describing the methodology, starting with the architecture of the ANNs, which are trained to predict the free energy of solvation of each atom in the system at every time step of the simulation. The inputs to the network are derived from symmetry functions that capture the local environment of each atom. The forces are extracted from the ANN output and then enter the equations of motion. Moreover, the chapter presents the implementation of the method in the LAMMPS molecular dynamics software package, enabling its practical application to a wide range of molecular systems. To assess the performance and accuracy of the method, three test cases are examined: the alanine dipeptide, icosalanine (a polymer composed of 20 alanine residues), and a small RNA fragment containing approximately 1000 atoms. The results indicate that the method performs well in describing general macroscopic features of the molecules. However, it shows limitations in predicting high-resolution properties, such as specific minima in the Ramachandran space of dihedral angles for the alanine dipeptide or the precise hydrogen bond network among the bases in the RNA fragment. Despite these limitations, the method demonstrates promising computational performance and scalability, making it a valuable tool for efficient implicit solvation simulations. Overall, this chapter presents a new approach to implicit solvation using artificial neural networks, showing that it describes general molecular features well while highlighting areas for further refinement.
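To make the notion of a symmetry function describing an atom's local environment more concrete, here is a minimal sketch of a radial descriptor with a smooth cosine cutoff. The abstract does not state which symmetry functions are used; the Behler-Parrinello-style radial form below, together with the parameter names `eta`, `r_s`, `r_c` and the toy coordinates, is an assumption for illustration only.

```python
import math

def cosine_cutoff(r, r_c):
    """Smoothly switch contributions off at the cutoff radius r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_symmetry_function(coords, i, eta, r_s, r_c):
    """G_i = sum over j != i of exp(-eta * (r_ij - r_s)^2) * f_c(r_ij)."""
    g = 0.0
    for j, pos in enumerate(coords):
        if j == i:
            continue
        r = math.dist(coords[i], pos)
        g += math.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
    return g

# Toy configuration: atom 0 has neighbours at distances 1 and 2,
# plus one atom at distance 9, beyond the cutoff of 6.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 9.0)]
g_ref = radial_symmetry_function(coords, 0, eta=0.5, r_s=0.0, r_c=6.0)

# The descriptor depends only on distances, so rotating or translating
# the whole configuration leaves it unchanged.
rotated = [(-y, x, z) for (x, y, z) in coords]
shifted = [(x + 3.0, y - 1.0, z + 2.0) for (x, y, z) in coords]
g_rot = radial_symmetry_function(rotated, 0, eta=0.5, r_s=0.0, r_c=6.0)
g_shift = radial_symmetry_function(shifted, 0, eta=0.5, r_s=0.0, r_c=6.0)
print(g_ref, g_rot, g_shift)
```

Invariance under rigid motions is the property that makes such descriptors suitable network inputs: the predicted per-atom quantity cannot depend on how the molecule is oriented in the simulation box.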
The method holds promise for enabling large-scale simulations at reduced computational cost, thereby expanding the scope of molecular dynamics studies.

Chapter 6

The sixth chapter presents the results of a multi-resolution molecular dynamics study of a virion particle of the Cowpea Chlorotic Mottle Virus (CCMV) and its constituent molecules. The chapter comprises several sets of MD simulations, each offering specific insights into the dynamics and behavior of the viral components. The first set investigates the coarse-grained folding and relaxation of the RNA2 viral single-stranded RNA (ssRNA) fragment. The simulations start from a free, rod-like polymer chain configuration, and the subsequent folding and relaxation processes are examined. Furthermore, the non-equilibrium squeezing of the folded RNA structure into a spherical region of space, mimicking the confinement within the capsid, is explored. These simulations aim to shed light on the structural and dynamic aspects of the viral RNA during the self-assembly of the virus. The chapter then presents multi-resolution equilibrium simulations of a trimer, which consists of three capsid molecules. The trimer is studied using five different approaches: an all-atom representation in explicit solvent, an all-atom representation in implicit solvent employing Debye-Hückel electrostatics, and three different applications of the CANVAS model. The latter is a multi-resolution model for proteins, developed in my group, which combines atomistic force fields with an elastic network model to describe protein dynamics with manually assigned levels of resolution. These simulations allow a detailed analysis of the trimer's dynamics and interactions, highlighting the influence of the different models on the observed behavior and testing the applicability of the CANVAS model to this kind of system.
Additionally, all-atom simulations in explicit solvent are performed for both the capsid and the virion, the latter including the RNA2 fragment within the capsid. These simulations provide detailed information about the structural and dynamic properties of the viral capsid and about the interactions between the capsid and the encapsulated RNA2 fragment. By combining multi-resolution approaches and simulations at different levels of detail, this chapter offers a broad picture of the RNA dynamics, folding, relaxation, and interactions within the CCMV virion. These findings contribute to our knowledge of viral assembly and of the behavior of viral constituents, supporting further insights into the functioning and stability of viral systems.
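The Debye-Hückel electrostatics used for the implicit-solvent trimer simulations replaces the bare Coulomb interaction with a screened one, $U(r) = \frac{q_1 q_2}{4\pi\varepsilon_0\varepsilon_r r}\, e^{-r/\lambda_D}$. The sketch below evaluates the Debye length and the screened pair energy; the ionic strength (0.1 M), the temperature, the relative permittivity of water, and all function names are illustrative assumptions and are not taken from the thesis.

```python
import math

# Physical constants in SI units.
EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m
K_B = 1.380649e-23               # Boltzmann constant, J/K
N_A = 6.02214076e23              # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19       # elementary charge, C
COULOMB_KJ_MOL_NM = 138.935458   # e^2 / (4 pi eps0) in kJ/mol * nm

def debye_length_nm(ionic_strength_molar, eps_r=78.4, temperature=298.15):
    """Debye length: sqrt(eps0 * eps_r * kB * T / (2 * N_A * e^2 * I))."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    lam_sq = (EPS0 * eps_r * K_B * temperature) / (
        2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si)
    return math.sqrt(lam_sq) * 1e9  # m -> nm

def debye_huckel_energy(q1, q2, r_nm, debye_nm, eps_r=78.4):
    """Screened Coulomb energy in kJ/mol for charges in units of e."""
    coulomb = COULOMB_KJ_MOL_NM * q1 * q2 / (eps_r * r_nm)
    return coulomb * math.exp(-r_nm / debye_nm)

# Assumed conditions: 0.1 M monovalent salt in water at 298.15 K.
lam = debye_length_nm(0.1)
u_screened = debye_huckel_energy(1.0, 1.0, 1.0, lam)
u_coulomb = COULOMB_KJ_MOL_NM / 78.4  # unscreened energy at r = 1 nm
print(f"Debye length: {lam:.3f} nm")
print(f"screened / unscreened energy at 1 nm: {u_screened / u_coulomb:.3f}")
```

Because the interaction decays exponentially beyond the Debye length, it can be truncated at a finite distance, which is part of what makes implicit-solvent runs cheaper than explicit-solvent ones.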

COMPUTATIONAL MULTISCALE INVESTIGATIONS OF BIOLOGICAL MOLECULES / Mattiotti, Giovanni. - (2023 Nov 20), pp. 1-224. [10.15168/11572_396649]

COMPUTATIONAL MULTISCALE INVESTIGATIONS OF BIOLOGICAL MOLECULES

Mattiotti, Giovanni
2023-11-20

20-nov-2023
XXXVI
2022-2023
Fisica (29/10/12-)
Physics
Potestio, Raffaello
no
English
Files in this item:
PhD_Thesis_v3.pdf (open access)
Type: Doctoral Thesis
License: All rights reserved
Size: 70.24 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/396649