Trustworthy knowledge extraction represents a bottleneck in the development of autonomous AI agents capable of integrating learning and reasoning capabilities. As a foundational framework of neuro-symbolic knowledge acquisition systems from semi-structured data, we introduce an approach that combines Large Language Model (LLM) functionalities with symbolic verification modules. In a process mining context, we propose to leverage LLMs to generate linear temporal logic specifications starting from sets of finite traces that represent event logs. In a knowledge representation setting, we focus instead on LLM-based extraction of description logic concepts to obtain human-readable conceptualizations that separate positive and negative labeled data instances. We integrate chat interfaces based on state-of-the-art LLMs with formal verification modules: in the process mining case, we employ model checking tools for linear temporal logic on finite traces; and, for description logic concept learning, we perform entailment checks using dedicated reasoning engines. First, we conduct a proof-of-concept evaluation of these architectures, comparing the performance of the LLMs on each task. We then provide an implementation of a GPT-based toolchain to automate the candidate generation and verification steps.
LLM-Driven Knowledge Extraction in Temporal and Description Logics / Duranti, D.; Giorgini, P.; Mazzullo, A.; Robol, M.; Roveri, M.. - 15370:(2025), pp. 190-208. (Intervento presentato al convegno 24th International Conference on Knowledge Engineering and Knowledge Management, EKAW 2024 tenutosi a nld nel 2024) [10.1007/978-3-031-77792-9_12].
LLM-Driven Knowledge Extraction in Temporal and Description Logics
Duranti D.
;Giorgini P.
;Mazzullo A.
;Robol M.
;Roveri M.
2025-01-01
Abstract
Trustworthy knowledge extraction represents a bottleneck in the development of autonomous AI agents capable of integrating learning and reasoning capabilities. As a foundational framework of neuro-symbolic knowledge acquisition systems from semi-structured data, we introduce an approach that combines Large Language Model (LLM) functionalities with symbolic verification modules. In a process mining context, we propose to leverage LLMs to generate linear temporal logic specifications starting from sets of finite traces that represent event logs. In a knowledge representation setting, we focus instead on LLM-based extraction of description logic concepts to obtain human-readable conceptualizations that separate positive and negative labeled data instances. We integrate chat interfaces based on state-of-the-art LLMs with formal verification modules: in the process mining case, we employ model checking tools for linear temporal logic on finite traces; and, for description logic concept learning, we perform entailment checks using dedicated reasoning engines. First, we conduct a proof-of-concept evaluation of these architectures, comparing the performance of the LLMs on each task. We then provide an implementation of a GPT-based toolchain to automate the candidate generation and verification steps.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione