In many applied fields, such as genomics, different types of data are collected on the same system, and it is not uncommon that some of these datasets are subject to censoring as a result of the measurement technologies used, such as data generated by polymerase chain reactions and flow cytometer. When the overall objective is that of network inference, at possibly different levels of a system, information coming from different sources and/or different steps of the analysis can be integrated into one model with the use of conditional graphical models. In this paper, we develop a doubly penalized inferential procedure for a conditional Gaussian graphical model when data can be subject to censoring. The computational challenges of handling censored data in high dimensionality are met with the development of an efficient expectation-maximization algorithm, based on approximate calculations of the moments of truncated Gaussian distributions and on a suitably derived two-step procedure alternating graphical lasso with a novel block-coordinate multivariate lasso approach. We evaluate the performance of this approach on an extensive simulation study and on gene expression data generated by RT-qPCR technologies, where we are able to integrate network inference, differential expression detection and data normalization into one model.
The conditional censored graphical lasso estimator / Augugliaro, L.; Sottile, G.; Vinciotti, V.. - In: STATISTICS AND COMPUTING. - ISSN 0960-3174. - 30:5(2020), pp. 1273-1289. [10.1007/s11222-020-09945-7]
The conditional censored graphical lasso estimator
Vinciotti V.
2020-01-01
Abstract
In many applied fields, such as genomics, different types of data are collected on the same system, and it is not uncommon that some of these datasets are subject to censoring as a result of the measurement technologies used, such as data generated by polymerase chain reactions and flow cytometer. When the overall objective is that of network inference, at possibly different levels of a system, information coming from different sources and/or different steps of the analysis can be integrated into one model with the use of conditional graphical models. In this paper, we develop a doubly penalized inferential procedure for a conditional Gaussian graphical model when data can be subject to censoring. The computational challenges of handling censored data in high dimensionality are met with the development of an efficient expectation-maximization algorithm, based on approximate calculations of the moments of truncated Gaussian distributions and on a suitably derived two-step procedure alternating graphical lasso with a novel block-coordinate multivariate lasso approach. We evaluate the performance of this approach on an extensive simulation study and on gene expression data generated by RT-qPCR technologies, where we are able to integrate network inference, differential expression detection and data normalization into one model.File | Dimensione | Formato | |
---|---|---|---|
s11222-020-09945-7.pdf
Solo gestori archivio
Tipologia:
Versione editoriale (Publisher’s layout)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
1.07 MB
Formato
Adobe PDF
|
1.07 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione