LLMs as Repositories of Factual Knowledge: Limitations and Solutions

LLMs' sources of knowledge are data snapshots containing factual information about entities, collected at different timestamps and from different media types (e.g., wikis and social media). Such unstructured knowledge changes over time as it is updated. Equally important are the inconsistencies and inaccuracies that occur across information sources. Consequently, the model's knowledge about an entity may be perturbed while training over the sequence of snapshots or at inference time, resulting in inconsistent and inaccurate model performance. In this work, we study the appropriateness of Large Language Models (LLMs) as repositories of factual knowledge. We consider twenty-four state-of-the-art LLMs that are either closed-source, partially open-source (weights), or fully open-source (weights and training data). We evaluate their reliability in responding to time-sensitive factual questions in terms of accuracy and consistency when prompts are perturbed. We further evaluate the effectiveness of state-of-the-art methods for improving LLMs' accuracy and consistency. We then propose “ENtity-Aware Fine-tuning” (ENAF), a soft neurosymbolic approach that provides a structured representation of entities during fine-tuning to reduce inconsistencies and improve response stability under prompt variations.

LLMs as Repositories of Factual Knowledge: Limitations and Solutions / Mousavi, S.M., Alghisi, S., Riccardi, G. - In: IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 2998-4173. - 34:(2026), pp. 2213-2226. [10.1109/TASLPRO.2026.3680709]

Mousavi S. M.; Alghisi S.; Riccardi G.

2026-01-01

Date of publication: 2026
Journal title: IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
DOI: https://dx.doi.org/10.1109/TASLPRO.2026.3680709
Scopus identifier: 2-s2.0-105035521327
All authors: Mousavi, S. M.; Alghisi, S.; Riccardi, G.
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/485290
