
Will LLMs Replace the Encoder-Only Models in Temporal Relation Classification? / Roccabruna, Gabriel; Rizzoli, Massimo; Riccardi, Giuseppe. - (2024), pp. 20402-20415. (2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, Florida, USA, 12-16 November 2024) [DOI: 10.18653/v1/2024.emnlp-main.1136].

Will LLMs Replace the Encoder-Only Models in Temporal Relation Classification?

Gabriel Roccabruna (first author); Massimo Rizzoli (second author); Giuseppe Riccardi (last author)
2024-01-01

Abstract

The automatic detection of temporal relations among events has mainly been investigated with encoder-only models such as RoBERTa. Large Language Models (LLMs) have recently shown promising performance in temporal reasoning tasks such as temporal question answering. Nevertheless, recent studies have tested the performance of closed-source LLMs only in detecting temporal relations, limiting the interpretability of those results. In this work, we investigate LLMs' performance and decision process in the Temporal Relation Classification task. First, we assess the performance of seven open- and closed-source LLMs, experimenting with in-context learning and lightweight fine-tuning approaches. Results show that LLMs with in-context learning significantly underperform smaller encoder-only models based on RoBERTa. Then, we delve into the possible reasons for this gap by applying explainability methods. The outcome suggests a limitation of LLMs in this task due to their autoregressive nature, which causes them to focus only on the last part of the sequence. Additionally, we evaluate the word embeddings of these two models to better understand their pre-training differences. The code and the fine-tuned models can be found on GitHub.
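As context for the in-context learning setup the abstract describes, the following is a minimal, hypothetical Python sketch of what a few-shot prompt for Temporal Relation Classification might look like. The label set, event markers, and demonstration sentences are illustrative assumptions, not the authors' actual prompt format or code.

    # Hypothetical sketch (not the authors' code): building an in-context-learning
    # prompt for Temporal Relation Classification (TRC).
    # Label set and demonstration sentences below are illustrative assumptions.

    LABELS = ["BEFORE", "AFTER", "INCLUDES", "SIMULTANEOUS"]  # assumed label set

    DEMONSTRATIONS = [
        # (sentence with the two target events marked, gold relation)
        ("She [E1] finished [/E1] the report before she [E2] left [/E2] the office.",
         "BEFORE"),
        ("The ceremony [E1] started [/E1] while the crowd was still [E2] arriving [/E2].",
         "INCLUDES"),
    ]

    def build_trc_prompt(sentence: str) -> str:
        """Assemble a few-shot prompt asking an LLM to classify the temporal
        relation between the events marked [E1]...[/E1] and [E2]...[/E2]."""
        lines = [
            "Classify the temporal relation between event E1 and event E2.",
            f"Answer with one of: {', '.join(LABELS)}.",
            "",
        ]
        for demo_sentence, relation in DEMONSTRATIONS:
            lines += [f"Sentence: {demo_sentence}", f"Relation: {relation}", ""]
        lines += [f"Sentence: {sentence}", "Relation:"]
        return "\n".join(lines)

    if __name__ == "__main__":
        # The resulting string would be sent to any LLM completion endpoint;
        # the predicted label is read off the model's continuation.
        print(build_trc_prompt(
            "He [E1] signed [/E1] the contract after the lawyers [E2] reviewed [/E2] it."
        ))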
Year: 2024
Published in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Conference location: Miami, Florida, USA
Publisher: Association for Computational Linguistics
ISBN: 9798891761643
Authors: Roccabruna, Gabriel; Rizzoli, Massimo; Riccardi, Giuseppe
Files in this record:
File: 2024.emnlp-main.1136.pdf
Access: open access
Type: Publisher's version (publisher's layout)
License: Creative Commons
Size: 409.11 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/438834
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science: not available
  • OpenAlex: 3