Predictive process monitoring (PPM) aims at creating models that predict aspects of interest of process execution using historical data available in event logs, mostly using machine learning (ML) techniques. When developing a PPM model, one has several design choices, encompassing both ML-related concerns, such as which classification or regression model to choose, and PPM-specific concerns, such as how to encode the trace prefixes or whether to drop infrequent activities when training a model. While the literature has seen a few attempts to study how these choices impact the performance of a PPM model, no systematic studies on this matter exist. This paper moves towards closing this gap. We propose a framework to interpret the impact of design choices on the performance of a PPM model. The proposed framework uses as building blocks a search space exploration algorithm, which is able to generate different model configurations, and explainable AI techniques, e.g., SHAP, to analyze the impact of design choices on the model performance based on the generated configurations. We show an instantiation of the framework in the use case of outcome-oriented PPM, discussing also the experimental results obtained using publicly available event logs.
Understanding the Impact of Design Choices on the Performance of Predictive Process Monitoring / Kim, S.; Comuzzi, M.; Di Francescomarino, C. - 503:(2024), pp. 153-164. (International workshops which were held in conjunction with 5th International Conference on Process Mining, ICPM 2023 ita 2023) [10.1007/978-3-031-56107-8_12].
Understanding the Impact of Design Choices on the Performance of Predictive Process Monitoring
Di Francescomarino C.
2024-01-01



