Predictive networks for multi meta-omics data integration / Zandonà, Alessandro. - (2017), pp. 1-228.

Predictive networks for multi meta-omics data integration

Zandonà, Alessandro
2017-01-01

Abstract

The role of the microbiome in disease onset and in equilibrium is being revealed by a wealth of high-throughput omics methods. All key research directions, e.g., the study of gut microbiome dysbiosis in IBD/IBS, indicate the need for bioinformatics methods that can model the complexity of microbial community ecology and unravel its disease-associated perturbations. A most promising direction is the “meta-omics” approach, which allows profiling based on various biological molecules at the metagenomic scale (e.g., metaproteomics, metametabolomics) as well as on different “microbial” omes (eukaryotes and viruses) within a systems biology approach. This thesis introduces a bioinformatic framework for microbiota datasets that combines predictive profiling, differential network analysis and meta-omics integration. In detail, the framework identifies biomarkers that discriminate among clinical phenotypes through machine learning techniques (Random Forest or SVM), following a complete Data Analysis Protocol derived from two FDA-funded initiatives: the MicroArray Quality Control-II and Sequencing Quality Control projects. The biomarkers are interpreted in terms of biological networks: the framework provides a setup for network inference, quantification of network differences based on the glocal Hamming and Ipsen-Mikhailov (HIM) distance, and detection of network communities. The differential analysis of networks allows the study of the structural organization of the microbiota as well as of the evolving trajectories of microbial communities associated with the dynamics of the target phenotypes. Moreover, the framework combines a novel similarity network fusion method with machine learning to identify biomarkers from the integration of multiple meta-omics data. The framework implementation requires only standard open source computational biology tools, combining R/Bioconductor and Python functions.
In particular, full scripts for meta-omics integration are available in a GitHub repository to ease reuse (https://github.com/AleZandona/INF). The pipeline has been validated on original data from three different clinical datasets. First, the predictive profiling and the differential network analysis were applied to a pediatric Inflammatory Bowel Disease (IBD) cohort (in faecal vs biopsy environments) and controls, in collaboration with a multidisciplinary team at the Ospedale Pediatrico Bambino Gesù (Rome, Italy). Then, the meta-omics integration was tested on a paired bacterial and fungal gut microbiota dataset from human IBD patients provided by the Gastroenterology Department of the Saint Antoine Hospital (Paris, France), thanks to the collaboration with the “Commensals and Probiotics-Host Interactions” team at INRA (Jouy-en-Josas, France). Finally, the framework was validated on a bacterial-fungal gut microbiota dataset from children affected by Rett syndrome. The different nature of the datasets used for validation supports the extension of the framework to other omics datasets. Clinical practice can also take advantage of the framework, given the reproducibility and robustness of the results, ensured by the adopted Data Analysis Protocol, and the biological relevance of the findings, confirmed by the clinical collaborators. Specifically, the omics-based dysbiosis profiles and the inferred biological networks can support current diagnostic tools in revealing disease-associated perturbations at a much earlier, prodromal stage of disease, and may be used for disease prevention, diagnosis and prognosis.
2017
XXIX
2017-2018
CIBIO (29/10/12-)
Biomolecular Sciences
Furlanello, Cesare
Chierici, Marco
no
English
Sector INF/01 - Computer Science
Sector MAT/06 - Probability and Mathematical Statistics
Files in this item:

File: zandona2017_phdthesis.pdf
Access: open access
Type: Doctoral Thesis
License: All rights reserved
Size: 10.45 MB
Format: Adobe PDF

File: zandona2017_disclaimer.pdf
Access: archive managers only
Type: Doctoral Thesis
License: All rights reserved
Size: 977.34 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/367893