Will LLMs Replace the Encoder-Only Models in Temporal Relation Classification?


The automatic detection of temporal relations among events has mainly been investigated with encoder-only models such as RoBERTa. Large Language Models (LLMs) have recently shown promising performance in temporal reasoning tasks such as temporal question answering. Nevertheless, recent studies have evaluated only closed-source LLMs on temporal relation detection, which limits the interpretability of their results. In this work, we investigate LLMs’ performance and decision process in the Temporal Relation Classification task. First, we assess the performance of seven open- and closed-source LLMs, experimenting with in-context learning and lightweight fine-tuning approaches. Results show that LLMs with in-context learning significantly underperform smaller encoder-only models based on RoBERTa. Then, we investigate the possible reasons for this gap by applying explainability methods. The outcome suggests a limitation of LLMs in this task due to their autoregressive nature, which causes them to focus only on the last part of the sequence. Additionally, we evaluate the word embeddings of these two types of models to better understand their pre-training differences. The code and the fine-tuned models are available on GitHub.
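The architectural difference the abstract refers to can be illustrated with a minimal sketch. It only contrasts the attention masks of a bidirectional encoder (as in RoBERTa) and an autoregressive decoder; it is not the authors' explainability analysis, and the function names and the toy sentence are hypothetical choices made for illustration.

```python
def attention_mask(n_tokens: int, causal: bool) -> list[list[int]]:
    """Build an n x n mask; mask[i][j] == 1 means token i may attend to token j."""
    return [
        # A causal (autoregressive) mask hides every position to the right of i.
        [1 if (not causal or j <= i) else 0 for j in range(n_tokens)]
        for i in range(n_tokens)
    ]


def visible_context(mask: list[list[int]]) -> list[int]:
    """Count how many positions each token is allowed to attend to."""
    return [sum(row) for row in mask]


# Toy sentence with two events ("left", "arrived"); not taken from the paper.
tokens = ["She", "left", "after", "he", "arrived"]

encoder_mask = attention_mask(len(tokens), causal=False)  # bidirectional attention
decoder_mask = attention_mask(len(tokens), causal=True)   # autoregressive attention

for tok, enc, dec in zip(
    tokens, visible_context(encoder_mask), visible_context(decoder_mask)
):
    print(f"{tok:>8}: encoder sees {enc} tokens, decoder sees {dec}")
```

In this sketch the representation of "left" in the decoder is computed without access to "arrived", so only the final positions see both events. That is consistent with, but does not by itself demonstrate, the limitation the abstract reports.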

Will LLMs Replace the Encoder-Only Models in Temporal Relation Classification? / Roccabruna, Gabriel; Rizzoli, Massimo; Riccardi, Giuseppe. - (2024). (Paper presented at the EMNLP conference held in Miami, Florida, USA, from 12th November to 16th November 2024).


Gabriel Roccabruna; Massimo Rizzoli; Giuseppe Riccardi

2024-01-01



Date of publication: 2024
Proceedings title: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Place of publication: Miami, Florida, USA
Publisher: Association for Computational Linguistics
All authors: Roccabruna, Gabriel; Rizzoli, Massimo; Riccardi, Giuseppe
					


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/438834
