Thanks to the capability of efficiently executing massive computations in parallel, general purpose graphic processing units (GPGPUs) have begun to be preferred to CPUs for several parallel applications in different domains. Two are the most relevant fields in which, recently, GPGPUs have begun to be employed: high performance computing (HPC) and embedded systems. The reliability requirements are different in these two applications domain. In order to be employed in safety-critical applications, GPGPUs for embedded systems must be qualified as reliable. In this paper, we analyze through neutron irradiation typical parallel algorithms for embedded GPGPUs and we evaluate their reliability. We analyze how caches and threads distributions affect the GPGPU reliability. The data have been acquired through neutron test experiments, performed at the VESUVIO neutron facility at ISIS. The obtained experimental results show that, if the L1 cache of the considered GPGPU is disabled, the algorithm execution is most reliable. Moreover, it is demonstrated that during a FFT execution most errors appear in the stages in which the GPGPU is completely loaded as the number of instantiated parallel tasks is higher.
Reliability evaluation of embedded gpgpus for safety critical applications / Sabena, D.; Sterpone, L.; Carro, L.; Rech, P.. - In: IEEE TRANSACTIONS ON NUCLEAR SCIENCE. - ISSN 0018-9499. - 61:6(2014), pp. 3123-3129. [10.1109/TNS.2014.2363358]
Reliability evaluation of embedded gpgpus for safety critical applications
Rech P.
2014-01-01
Abstract
Thanks to the capability of efficiently executing massive computations in parallel, general purpose graphic processing units (GPGPUs) have begun to be preferred to CPUs for several parallel applications in different domains. Two are the most relevant fields in which, recently, GPGPUs have begun to be employed: high performance computing (HPC) and embedded systems. The reliability requirements are different in these two applications domain. In order to be employed in safety-critical applications, GPGPUs for embedded systems must be qualified as reliable. In this paper, we analyze through neutron irradiation typical parallel algorithms for embedded GPGPUs and we evaluate their reliability. We analyze how caches and threads distributions affect the GPGPU reliability. The data have been acquired through neutron test experiments, performed at the VESUVIO neutron facility at ISIS. The obtained experimental results show that, if the L1 cache of the considered GPGPU is disabled, the algorithm execution is most reliable. Moreover, it is demonstrated that during a FFT execution most errors appear in the stages in which the GPGPU is completely loaded as the number of instantiated parallel tasks is higher.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione