Machine Learning (ML) models are increasingly used by domain experts to tackle classification tasks, aiming for high predictive accuracy. However, classifiers are inherently prone to misclassifications, especially when they encounter unfamiliar, previously unseen, or out-of-distribution input data. This creates significant challenges for their deployment in critical Cyber-Physical Systems (CPSs), such as autonomous vehicles, industrial control systems, and medical devices, where misclassifications can lead to severe consequences for people, infrastructure, and the environment. This paper argues that ML classifiers intended for critical applications should be neither designed nor evaluated in isolation. Instead, Critical System Classifiers (CSCs) primarily aim at reducing misclassifications by rejecting uncertain predictions and triggering mitigation strategies integrated into the encompassing CPS. We present a high-level CSC architecture that supports black-box classifier integration, preprocessing for unknown detection, post-hoc calibration, and cost-sensitive thresholding. We emphasize the need for cost-aware evaluation metrics that explicitly account for rejected predictions, enabling a more realistic assessment of classifier performance in critical systems. We validate our approach through experiments on tabular datasets related to failure prediction, intrusion detection, and error detection, which are common use cases for classifiers in CPSs. Key findings include: (i) cost-sensitive evaluation often leads to the selection of different classifiers than standard metrics suggest; (ii) tree-based models outperform statistical ones in classification tasks; (iii) calibration and rejection mechanisms provide a robust notion of confidence; and (iv) combining multiple uncertainty-based rejection strategies achieves a favorable trade-off between high accuracy, low rejection rates, and cost. All experiments and implementations are publicly available via our CINNABAR GitHub repository.
Overall, this study offers a system-level perspective and practical software architecture for safely deploying ML classifiers in critical CPS domains, paving the way toward more trustworthy and certifiable AI in real-world infrastructures.
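
As an illustration of the rejection and cost-sensitive thresholding ideas summarized above, the following is a minimal sketch and not the CINNABAR implementation. It assumes the classifier exposes (ideally calibrated) class probabilities, and the function names, the candidate thresholds, and the costs (10 per misclassification, 1 per rejection) are all hypothetical choices made for the example.

```python
from typing import Optional, Sequence

REJECT = None  # sentinel: no prediction, defer to a system-level mitigation


def predict_with_reject(probs: Sequence[float], threshold: float) -> Optional[int]:
    """Return the most likely class index, or REJECT if its probability is below threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else REJECT


def average_cost(y_true, y_pred, cost_misclassified=10.0, cost_rejected=1.0):
    """Average per-sample cost; correct predictions cost 0 (assumed cost values)."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        if pred is REJECT:
            total += cost_rejected
        elif pred != truth:
            total += cost_misclassified
    return total / len(y_true)


def pick_threshold(prob_rows, y_true, candidates, **costs):
    """Choose the candidate threshold with the lowest average cost on validation data."""
    def cost_at(threshold):
        preds = [predict_with_reject(row, threshold) for row in prob_rows]
        return average_cost(y_true, preds, **costs)
    return min(candidates, key=cost_at)


# Toy validation set: class probabilities and true labels (made-up numbers).
rows = [[0.9, 0.1], [0.6, 0.4], [0.55, 0.45], [0.2, 0.8]]
labels = [0, 1, 0, 1]
print(pick_threshold(rows, labels, [0.5, 0.7, 0.95]))  # prints 0.7
```

In this toy case, accepting every prediction (threshold 0.5) incurs one costly misclassification, rejecting everything (threshold 0.95) pays the rejection cost on every sample, and the intermediate threshold gives the lowest average cost. A plain accuracy metric would not capture this difference, because it does not price rejections at all.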

Bringing Machine Learning Classifiers Into Critical Cyber-Physical Systems: a Matter of Design / Sayin, Burcu; Zoppi, Tommaso; Marchini, Nicolò; Ahmed Khokhar, Fahad; Passerini, Andrea. - In: IEEE ACCESS. - ISSN 2169-3536. - 13:(2025), pp. 94858-94877. [10.1109/ACCESS.2025.3568501]

Bringing Machine Learning Classifiers Into Critical Cyber-Physical Systems: a Matter of Design

Burcu Sayin; Tommaso Zoppi; Andrea Passerini
2025-01-01

Files in this item:
Bringing Machine Learning Classifiers Into Critical Cyber-Physical Systems.pdf

open access

Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 1.04 MB
Format: Adobe PDF
Bringing_Machine_Learning_Classifiers_Into_Critical_Cyber-Physical_Systems_A_Matter_of_Design.pdf

open access

Type: Publisher's layout
License: Creative Commons
Size: 3.44 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/453732