Duplication with Comparison (DWC) is an effective software-level solution to improve the reliability of computing systems, including Graphics Processing Units (GPUs). DWC, however, introduces performance and energy consumption overheads that could be unacceptable for High-Performance Computing (HPC) or real-time safety-critical applications. In this work, we propose Reduced-Precision DWC (RP-DWC): an improvement over the traditional DWC approach that uses the hardware resources of mixed-precision GPUs to implement fault detection. We investigate, through both fault injection campaigns and accelerated neutron beam experiments, the impact of RP-DWC on performance, energy consumption, and fault detection capabilities. We show that RP-DWC achieves on average 74% fault coverage (up to 86%) with very small overheads (0.1% time and 24% energy consumption overhead, in the best case).
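The general idea of comparing a full-precision result against a reduced-precision copy can be illustrated with a minimal sketch. This is not the authors' GPU implementation: it emulates the two precisions on the CPU (binary64 for the main computation, binary32 for the redundant copy), and the function names, the dot-product workload, the 1e-4 tolerance, and the bit-flip fault model are all assumptions chosen for illustration.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot_full(a, b):
    """Main computation, carried out in full (binary64) precision."""
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y
    return acc

def dot_reduced(a, b):
    """Redundant copy: every intermediate value is rounded to binary32."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_float32(to_float32(x) * to_float32(y))
        acc = to_float32(acc + prod)
    return acc

def rp_dwc_agree(full: float, reduced: float, rel_tol: float = 1e-4) -> bool:
    """Return True when the two copies agree within rel_tol (assumed threshold)."""
    return math.isclose(full, reduced, rel_tol=rel_tol)

def flip_bit(x: float, bit: int) -> float:
    """Hypothetical fault model: flip one bit of the binary64 encoding of x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))[0]

a = [0.1 * i for i in range(1, 9)]
b = [1.0 / i for i in range(1, 9)]
full = dot_full(a, b)
reduced = dot_reduced(a, b)

fault_free_ok = rp_dwc_agree(full, reduced)                 # copies agree
high_bit_ok = rp_dwc_agree(flip_bit(full, 51), reduced)     # large corruption
low_bit_ok = rp_dwc_agree(flip_bit(full, 0), reduced)       # tiny corruption
print(fault_free_ok, high_bit_ok, low_bit_ok)
```

In this sketch the comparison must tolerate the rounding difference between the two precisions, so a corruption smaller than the tolerance (the low-bit flip) passes unnoticed while a large one (the high-bit flip) is flagged. That trade-off is a property of the sketch; the coverage figures reported above come from the authors' own experiments.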

Reduced-Precision DWC for Mixed-Precision GPUs / Santos, F. F. D.; Brandalero, M.; Basso, P. M.; Hubner, M.; Carro, L.; Rech, P. - (2020), pp. 1-6. (Paper presented at the 26th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2020, held in ita in 2020) [10.1109/IOLTS50870.2020.9159748].

Reduced-Precision DWC for Mixed-Precision GPUs

Rech P.
2020-01-01

2020
Proceedings - 2020 26th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2020
usa
Institute of Electrical and Electronics Engineers Inc.
978-1-7281-8187-5
Santos, F. F. D.; Brandalero, M.; Basso, P. M.; Hubner, M.; Carro, L.; Rech, P.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/346707