The high performance, high efficiency, and low cost of Commercial Off-The-Shelf (COTS) devices make them attractive for applications with strict reliability constraints. Today, COTS devices are adopted in HPC and in safety-critical applications such as autonomous driving. Unfortunately, the cheap natural boron widely used in the COTS chip manufacturing process makes these devices highly susceptible to thermal (low-energy) neutrons. In this paper, we demonstrate that thermal neutrons are a significant threat to COTS device reliability. For our study, we consider two DDR memories, an AMD APU, three NVIDIA GPUs, an Intel accelerator, and an FPGA executing a relevant set of algorithms. We consider different scenarios that affect the thermal neutron flux, such as weather, concrete walls and floors, and HPC liquid cooling systems. Correlating beam experiments and neutron detector data, we show that the thermal neutron FIT rate could be comparable to, or even higher than, the high-energy neutron FIT rate.
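The abstract does not spell out how a FIT rate is obtained from beam data and a measured flux. A minimal sketch follows, assuming the convention common in radiation testing: the cross section is the number of errors observed under the beam divided by the fluence, and the FIT rate (failures per 10^9 device-hours) is the cross section multiplied by the environmental flux. The function names and all numeric inputs are hypothetical placeholders, not values from the paper.

```python
HOURS_PER_FIT_WINDOW = 1e9  # FIT counts failures per 10^9 device-hours


def cross_section(observed_errors: int, fluence_n_per_cm2: float) -> float:
    """Cross section in cm^2: errors seen under the beam divided by fluence."""
    if fluence_n_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return observed_errors / fluence_n_per_cm2


def fit_rate(sigma_cm2: float, flux_n_per_cm2_per_hour: float) -> float:
    """FIT rate: cross section times environmental flux, scaled to 10^9 hours."""
    if sigma_cm2 < 0 or flux_n_per_cm2_per_hour < 0:
        raise ValueError("cross section and flux must be non-negative")
    return sigma_cm2 * flux_n_per_cm2_per_hour * HOURS_PER_FIT_WINDOW


if __name__ == "__main__":
    # Placeholder inputs chosen only to exercise the arithmetic (assumptions).
    sigma = cross_section(observed_errors=100, fluence_n_per_cm2=1e10)
    print(fit_rate(sigma, flux_n_per_cm2_per_hour=10.0))
```

Under this convention, the thermal and high-energy FIT rates of a device can be compared by evaluating `fit_rate` once per energy range, each with its own beam-derived cross section and its own flux for the scenario of interest.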

Thermal neutrons: a possible threat for supercomputer reliability / Oliveira, Daniel; Blanchard, Sean; Debardeleben, Nathan; dos Santos, Fernando Fernandes; Davila, Gabriel Piscoya; Navaux, Philippe; Favalli, Andrea; Schappert, Opale; Wender, Stephen; Cazzaniga, Carlo; Frost, Christopher; Rech, Paolo. - In: THE JOURNAL OF SUPERCOMPUTING. - ISSN 0920-8542. - 77:2(2021), pp. 1612-1634. [10.1007/s11227-020-03324-9]

Thermal neutrons: a possible threat for supercomputer reliability

Rech, Paolo
Last author
2021-01-01

Files in this item:
File: JSC_s11227-020-03324-9.pdf
Access: Archive administrators only
Type: Published version (publisher's layout)
License: All rights reserved
Size: 1.84 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/346715
Citations
  • PubMed Central: not available
  • Scopus: 6
  • Web of Science: 7
  • OpenAlex: not available