High dimensional and heterogeneous count data are collected in various applied fields. In this paper, we look closely at high-resolution sequencing data on the microbiome, which have enabled researchers to study the genomes of entire microbial communities. Revealing the underlying interactions between these communities is of vital importance to learn how microbes influence human health. To perform structural learning from multivariate count data such as these, we develop a novel Gaussian copula graphical model with two key elements. Firstly, we employ parametric regression to characterize the marginal distributions. This step is crucial for accommodating the impact of external covariates. Neglecting this adjustment could potentially introduce distortions in the inference of the underlying network of dependences. Secondly, we advance a Bayesian structure learning framework, based on a computationally efficient search algorithm that is suited to high dimensionality. The approach returns simultaneous inference of the marginal effects and of the dependence structure, including graph uncertainty estimates. A simulation study and a real data analysis of microbiome data highlight the applicability of the proposed approach at inferring networks from multivariate count data in general, and its relevance to microbiome analyses in particular. The proposed method is implemented in the R package BDgraph.

Bayesian Structural Learning with Parametric Marginals for Count Data: An Application to Microbiota Systems / Vinciotti, Veronica; Behrouzi, Pariya; Mohammadi, Reza. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1533-7928. - 2024, 25:(2024), pp. 1-26.

Bayesian Structural Learning with Parametric Marginals for Count Data: An Application to Microbiota Systems.

Vinciotti, Veronica
;
2024-01-01

Abstract

High dimensional and heterogeneous count data are collected in various applied fields. In this paper, we look closely at high-resolution sequencing data on the microbiome, which have enabled researchers to study the genomes of entire microbial communities. Revealing the underlying interactions between these communities is of vital importance to learn how microbes influence human health. To perform structural learning from multivariate count data such as these, we develop a novel Gaussian copula graphical model with two key elements. Firstly, we employ parametric regression to characterize the marginal distributions. This step is crucial for accommodating the impact of external covariates. Neglecting this adjustment could potentially introduce distortions in the inference of the underlying network of dependences. Secondly, we advance a Bayesian structure learning framework, based on a computationally efficient search algorithm that is suited to high dimensionality. The approach returns simultaneous inference of the marginal effects and of the dependence structure, including graph uncertainty estimates. A simulation study and a real data analysis of microbiome data highlight the applicability of the proposed approach at inferring networks from multivariate count data in general, and its relevance to microbiome analyses in particular. The proposed method is implemented in the R package BDgraph.
2024
Vinciotti, Veronica; Behrouzi, Pariya; Mohammadi, Reza
Bayesian Structural Learning with Parametric Marginals for Count Data: An Application to Microbiota Systems / Vinciotti, Veronica; Behrouzi, Pariya; Mohammadi, Reza. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1533-7928. - 2024, 25:(2024), pp. 1-26.
File in questo prodotto:
File Dimensione Formato  
23-0056.pdf

accesso aperto

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Creative commons
Dimensione 1.43 MB
Formato Adobe PDF
1.43 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/439670
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? 1
  • OpenAlex ND
social impact