First-AID: the first Annotation Interface for grounded Dialogues / Menini, Stefano; Russo, Daniel; Palmero Aprosio, Alessio; Guerini, Marco. - (2025), pp. 563-571. (63rd Annual Meeting of the Association for Computational Linguistics, Vienna, Austria, July 27–August 1, 2025) [10.18653/v1/2025.acl-demo.54].

First-AID: the first Annotation Interface for grounded Dialogues

Menini, Stefano; Russo, Daniel; Palmero Aprosio, Alessio; Guerini, Marco
2025-01-01

Abstract

The swift advancement of Large Language Models (LLMs) has led to their widespread use across various tasks and domains, where they demonstrate remarkable generalization capabilities. However, achieving optimal performance in specialized tasks often requires fine-tuning LLMs with task-specific resources. Creating high-quality, human-annotated datasets for this purpose is challenging due to financial constraints and the limited availability of human experts. To address these limitations, we propose First-AID, a novel human-in-the-loop (HITL) data collection framework for the knowledge-driven generation of synthetic dialogues using LLM prompting. In particular, our framework implements different data collection strategies that require different levels of user intervention during dialogue generation, with the aim of reducing post-editing effort and enhancing the quality of the generated dialogues. We also evaluated First-AID on the collection of misinformation- and hate-countering dialogues, demonstrating (1) its potential for efficient and high-quality data generation and (2) its adaptability to different practical constraints thanks to the three data
Year: 2025
Published in: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Place: Vienna, Austria
Publisher: Association for Computational Linguistics
Authors: Menini, Stefano; Russo, Daniel; Palmero Aprosio, Alessio; Guerini, Marco
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/469033