Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of factual knowledge. However, understanding their underlying reasoning and internal mechanisms for exploiting this knowledge remains an active research area. This work unveils the factual information an LLM represents internally during sentence-level claim verification. We propose an end-to-end framework that decodes factual knowledge embedded in token representations from a vector space into a set of ground predicates, showing its layer-wise evolution using a dynamic knowledge graph. Our framework employs activation patching, a vector-level technique that alters a token representation during inference, to extract encoded knowledge; accordingly, it relies neither on training nor on external models. Using factual and common-sense claims from two claim-verification datasets, we showcase interpretability analyses at local and global levels. The local analysis highlights entity centrality in LLM reasoning, from claim-related information and multi-hop reasoning to representation errors that cause erroneous evaluations. The global analysis, in turn, reveals trends in the underlying evolution, such as word-based knowledge evolving into claim-related facts. By interpreting semantics from LLM latent representations and enabling graph-based analyses, this work deepens the understanding of the factual-knowledge resolution process.
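The activation patching the abstract describes — overwriting one token's hidden representation between layers during a forward pass — can be sketched with a dependency-free toy model. The two mixing layers, the dimensions, and the zero-vector replacement below are illustrative assumptions for the sketch, not the paper's actual setup.

```python
# Minimal sketch of activation patching on a toy two-layer "transformer":
# each layer transforms tokens and mixes them via their mean, standing in
# for attention + MLP; the patch replaces one token's hidden state between
# the two layers, and the change propagates to the other tokens downstream.
import random

random.seed(0)
d, n = 4, 3  # hidden size, number of tokens


def rand_matrix():
    return [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]


W1, W2 = rand_matrix(), rand_matrix()


def layer(tokens, W):
    # Per-token linear transform, then mix every token with the mean of all
    # tokens so that a change in one token affects the others.
    transformed = [
        [sum(W[i][j] * t[j] for j in range(d)) for i in range(d)] for t in tokens
    ]
    mean = [sum(t[i] for t in transformed) / n for i in range(d)]
    return [[(t[i] + mean[i]) / 2 for i in range(d)] for t in transformed]


def forward(tokens, patch=None):
    # patch = (token_index, replacement_vector), applied after layer 1 —
    # the vector-level intervention the abstract calls activation patching.
    h = layer(tokens, W1)
    if patch is not None:
        idx, vec = patch
        h = [vec if k == idx else t for k, t in enumerate(h)]
    return layer(h, W2)


x = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
clean = forward(x)
patched = forward(x, patch=(1, [0.0] * d))  # zero out token 1 mid-network
```

Because the second layer mixes tokens, the patched run diverges from the clean run not only at the patched token but at the others as well — the same propagation effect the framework exploits to read out what a representation encodes.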

Unveiling LLMs: The Evolution of Latent Representations in a Dynamic Knowledge Graph / Bronzini, Marco; Nicolini, Carlo; Lepri, Bruno; Staiano, Jacopo; Passerini, Andrea. - ELECTRONIC. - (2024). (Paper presented at COLM 2024, held in Philadelphia, USA, 7th October-9th October 2024).

Unveiling LLMs: The Evolution of Latent Representations in a Dynamic Knowledge Graph

Bronzini, Marco; Nicolini, Carlo; Lepri, Bruno; Staiano, Jacopo (co-last); Passerini, Andrea
2024-01-01

2024
First Conference on Language Modeling
Philadelphia, USA
openreview.net
Bronzini, Marco; Nicolini, Carlo; Lepri, Bruno; Staiano, Jacopo; Passerini, Andrea
Files for this record:
No files are associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/425730
Warning! The data shown have not been validated by the university.

Citations
  • PubMed Central: not available
  • Scopus: not available
  • Web of Science: not available
  • OpenAlex: not available