Matrix multiplication has always been a cornerstone in computer science. In fact, linear algebra tools permeate a wide variety of applications, from weather forecasting to financial market prediction, radio signal processing, computer vision, and more. Since many of these applications impose strict performance and/or fault tolerance constraints, the demand for fast and reliable matrix multiplication (MxM) is at an all-time high. Typically, increased reliability is achieved through redundancy. However, coarse-grain duplication incurs an often prohibitive overhead, higher than 100%. Thanks to the peculiar characteristics of the MxM algorithm, more efficient algorithm-based hardening solutions have been designed to detect (and even correct) some types of errors with lower overhead. We show that, despite being more efficient, current solutions are still sub-optimal in certain scenarios, particularly when considering persistent faults in Field-Programmable Gate Arrays (FPGAs). Based on a thorough analysis of the fault model, we propose an error detection technique for MxM that decreases both algorithmic and architectural costs by over a polynomial degree when compared to existing algorithm-based strategies. Furthermore, we report arithmetic overheads at the application level to be under 1% for three state-of-the-art Convolutional Neural Networks (CNNs).
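For readers unfamiliar with algorithm-based hardening, the following is a minimal sketch of the classical checksum idea that such schemes typically build on. It is not the technique proposed in the paper, whose details are not given in this record. The function names `matmul` and `checksum_detect` are hypothetical, and integer matrices are assumed so that the comparison can be exact; a floating-point version would need a tolerance.

```python
from typing import List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix product (n x k times k x m)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def checksum_detect(a: Matrix, b: Matrix, c: Matrix) -> bool:
    """Return True if c is inconsistent with the product of a and b.

    Relies on the identity (e^T A) B = e^T (A B), where e is the all-ones
    vector: the column sums of C must equal the row vector of column sums
    of A multiplied by B. The check needs one vector-matrix product and
    one pass over C, instead of recomputing the full product.
    """
    # Hypothetical helper values, named for illustration only.
    col_sums_a = [sum(row[p] for row in a) for p in range(len(a[0]))]
    expected = [sum(col_sums_a[p] * b[p][j] for p in range(len(b)))
                for j in range(len(b[0]))]
    actual = [sum(row[j] for row in c) for j in range(len(c[0]))]
    return expected != actual


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = matmul(a, b)
clean_flag = checksum_detect(a, b, c)       # no fault injected

faulty = [row[:] for row in c]
faulty[0][1] += 1                           # corrupt a single output element
faulty_flag = checksum_detect(a, b, faulty)
```

In this sketch a single corrupted output element changes one column sum, so the check flags it. Errors that cancel out within a column would go unnoticed by this column-only check, which is one reason practical schemes often combine row and column checksums.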

Efficient Error Detection for Matrix Multiplication with Systolic Arrays on FPGAs / Libano, Fabiano; Rech, Paolo; Brunhaver, John. - In: IEEE TRANSACTIONS ON COMPUTERS. - ISSN 0018-9340. - 72:8(2023), pp. 2390-2403. [10.1109/TC.2023.3248282]

Efficient Error Detection for Matrix Multiplication with Systolic Arrays on FPGAs

Rech, Paolo (second author); 2023-01-01

Files in this record:

File: TC_2021_ASU_UFRGS_POLITO__Rev1_.pdf
Access: Open access
Type: Non-refereed preprint
License: All rights reserved
Size: 3.35 MB
Format: Adobe PDF

File: Efficient_Error_Detection_for_Matrix_Multiplication_With_Systolic_Arrays_on_FPGAs.pdf
Access: Archive managers only
Type: Publisher's layout
License: All rights reserved
Size: 2.63 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/376659
Citations
  • PubMed Central: not available
  • Scopus: 8
  • Web of Science (ISI): 7
  • OpenAlex: not available