Background: Interrogation of whole exome and targeted sequencing NGS data is rapidly becoming a preferred approach for exploring large cohorts in the research setting and, importantly, in the context of precision medicine. Data retrieval and processing at the single-base and genomic-region level still constitute major bottlenecks in NGS data analysis. Fast and scalable tools are hence needed.

Results: PaCBAM is a command-line tool written in C and designed for the characterization of genomic regions and single nucleotide positions from whole exome and targeted sequencing data. PaCBAM computes depth of coverage and allele-specific pileup statistics, implements a fast and scalable multi-core computational engine, introduces an innovative and efficient on-the-fly read-duplicate filtering strategy, and provides comprehensive text output files and visual reports. We demonstrate that PaCBAM exploits parallel computation resources better than existing tools, resulting in important reductions in processing time and memory usage and hence enabling fast and efficient exploration of large datasets.

Conclusions: PaCBAM is a fast and scalable tool designed to process genomic regions from NGS data files and generate comprehensive coverage and pileup statistics for downstream analysis. The tool can be easily integrated into NGS processing pipelines and is available from Bitbucket and Docker/Singularity hubs.
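To make the terms "depth of coverage" and "allele-specific pileup" concrete, the following is a minimal Python sketch on toy data. It is not PaCBAM's implementation (which is written in C and reads NGS data files). The `Read` tuple, the function names, and in particular the duplicate criterion (same start, length and strand) are assumptions introduced here for illustration only; the abstract does not describe how PaCBAM detects duplicates.

```python
from collections import Counter, namedtuple

# Hypothetical toy read: 0-based start, ungapped bases, strand.
Read = namedtuple("Read", ["start", "seq", "strand"])


def pileup(reads, region_start, region_end, drop_duplicates=True):
    """Count bases per position in the half-open region [region_start, region_end)."""
    counts = {pos: Counter() for pos in range(region_start, region_end)}
    seen = set()  # keys of reads that were already counted
    for read in reads:
        if drop_duplicates:
            # Assumed criterion, for illustration only: two reads with the
            # same start, length and strand are treated as duplicates.
            key = (read.start, len(read.seq), read.strand)
            if key in seen:
                continue  # filtered on the fly, in the same pass as counting
            seen.add(key)
        for offset, base in enumerate(read.seq):
            pos = read.start + offset
            if region_start <= pos < region_end:
                counts[pos][base] += 1
    return counts


def summarize(counts):
    """Return rows of (position, depth, A, C, G, T) sorted by position."""
    rows = []
    for pos in sorted(counts):
        c = counts[pos]
        rows.append((pos, sum(c.values()), c["A"], c["C"], c["G"], c["T"]))
    return rows


if __name__ == "__main__":
    toy_reads = [
        Read(100, "ACGT", "+"),
        Read(100, "ACGT", "+"),  # duplicate of the first read under the assumed criterion
        Read(101, "CGTA", "-"),
        Read(102, "TTAC", "+"),
    ]
    for row in summarize(pileup(toy_reads, 101, 104)):
        print(*row, sep="\t")
```

Filtering duplicates inside the counting loop avoids a separate pass over the reads, which is the general idea behind an on-the-fly strategy; a real tool would also have to handle gapped alignments, base and mapping qualities, and paired-end reads.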

PaCBAM: Fast and scalable processing of whole exome and targeted sequencing data / Valentini, S.; Fedrizzi, T.; Demichelis, F.; Romanel, A. - In: BMC GENOMICS. - ISSN 1471-2164. - Electronic. - 20:1(2019), pp. 10181-10185. [10.1186/s12864-019-6386-6]

PaCBAM: Fast and scalable processing of whole exome and targeted sequencing data

Valentini S.; Fedrizzi T.; Demichelis F.; Romanel A.
2019

Files in this item:
File: s12864-019-6386-6.pdf
Access: open access
Type: Published version (publisher's layout)
License: Creative Commons
Size: 875.63 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11572/257920
Citations
  • PMC 3
  • Scopus 2
  • ISI 2