Exploiting the potential of metagenomics to uncover novel and uncharacterized gut microbiome diversity / Golzato, Davide. - (2024 Dec 16), pp. 1-113.

Exploiting the potential of metagenomics to uncover novel and uncharacterized gut microbiome diversity

Golzato, Davide
2024-12-16

Abstract

Metagenomic sequencing has revolutionized gut microbiome research by providing comprehensive access to the entire genomic content of a biological sample, namely its metagenome. Because it allows microbial ecosystems to be studied in depth without direct isolation or cultivation of their members, metagenomics has greatly expanded knowledge of the taxonomic and functional diversity of the human gut microbiome and of how deeply it is involved in human physiology. Metagenomic assembly is a computational technique that enables the reconstruction of bacterial genomes, known as metagenome-assembled genomes (MAGs). Systematically recovering MAGs from gut metagenomes has allowed researchers to progressively unravel the complexity of the microbiome-host system by cataloging and characterizing the genomes of thousands of previously unknown bacterial lineages that comprise it. Despite its importance, this task faces computational limitations that complicate the recovery of microbial diversity associated with rare and low-abundance species, popularly known as the 'microbial dark matter'. Consequently, optimizing the use of available metagenomic data to maximize observable diversity and genome reconstruction is crucial for comprehensive microbiome analysis. In this doctoral thesis, I explore how concurrently processing multiple biologically similar metagenomes, when available, with reference- and assembly-based approaches can help identify previously undetected bacterial species. More specifically, I applied metagenomic (co)assembly and (co)binning to a cohort of ultra-deep, redundantly sequenced gut metagenomes from a small number of individuals. I demonstrate that the careful application of this approach allows the recovery of high-quality MAGs from novel and under-characterized bacterial species that would otherwise be missed when analyzing a single sample.
This allowed the reconstruction of genomes from 198 species lacking reference genomes and 39 completely novel microbial species from gut communities that should already be well represented, highlighting that a significant amount of phylogenetic diversity has remained hidden primarily because of the low sequencing depth of most studies rather than an insufficient number of sampled individuals. Although multi-sample approaches have been applied in numerous studies for these reasons, this work outlines the ideal conditions for applying them in cross-sectional and longitudinal contexts to minimize assembly errors. I show that (co)assembly is most effective with samples from the same subject, as combining samples from unrelated subjects generates strain-chimeric MAGs that do not represent actual strain populations. In parallel, I provide estimates of the sequencing requirements needed to capture this diversity by complementing (co)assembly with reference-based methods. The findings in this thesis advance our understanding of metagenomic assembly techniques and highlight the importance of optimizing data usage in microbiome studies. The recovery of high-quality MAGs supports various applications, from surveying unknown species to guiding their experimental isolation and characterization. Furthermore, integrating these MAGs into reference-based approaches enables large-scale screening to draw associations with host-related variables, ultimately contributing to a more comprehensive understanding of the gut microbiome.
16 Dec 2024
XXXVII
2023-2024
CIBIO (29/10/12-)
Biomolecular Sciences
Segata, Nicola
Valles Colomer, Mireia
no
Italy
English
Files in this item:

GolzatoDavide_PHD_thesis.pdf
Access: open access
Type: Doctoral thesis
License: Creative Commons
Size: 10.39 MB
Format: Adobe PDF

Supplemental_table.pdf
Access: open access
Type: Other attachments
License: Creative Commons
Size: 869.48 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/439471