Reliability analysis of operating systems and software stack for embedded systems / Santini, T.; Carro, L.; Wagner, F. R.; Rech, P. - In: IEEE TRANSACTIONS ON NUCLEAR SCIENCE. - ISSN 0018-9499. - 63:4(2016), pp. 2225-2232. [10.1109/TNS.2015.2513384]
Reliability analysis of operating systems and software stack for embedded systems
Rech P.
2016-01-01
Abstract
In this paper, we investigate how the presence of a general-purpose operating system influences the reliability of modern embedded Systems-on-Chips (SoCs). We experimentally compare the neutron-induced error rate of SoCs when the application executes on bare metal (i.e., without an underlying operating system) and when it executes on top of the Linux kernel. Our analysis demonstrates that the presence of Linux barely affects the Silent Data Corruption (SDC) rate, whereas it greatly increases the system Functional Interruption (SEFI) rate (by up to 7.48 times) if no preventive measures are taken. Nevertheless, we experimentally demonstrate that cache conflicts between the operating system and the application can be leveraged to significantly reduce the Linux-induced increase in the SEFI rate. Moreover, we evaluate the masking effect of the OS software stack and show that the higher the abstraction layer in which an application is implemented, the lower its SDC rate. Furthermore, we analyze system reliability taking into account not only the resulting failure rates but also the execution (and, thus, exposure) times.