Greco, Claudio; Testoni, Alberto; Bernardi, Raffaella (2021). Grounding Dialogue History: Strengths and Weaknesses of Pre-trained Transformers. Electronic, 12414, pp. 236-279. doi: 10.1007/978-3-030-77091-4_17
Grounding Dialogue History: Strengths and Weaknesses of Pre-trained Transformers
Greco, Claudio; Testoni, Alberto; Bernardi, Raffaella
2021-01-01
Abstract
We focus on visually grounded dialogue history encoding. We show that GuessWhat?! can be used as a “diagnostic” dataset to understand whether state-of-the-art encoders manage to capture salient information in the dialogue history. We compare models across several dimensions: the architecture (Recurrent Neural Networks vs. Transformers), the input modalities (language only vs. language and vision), and the model's background knowledge (trained from scratch vs. pre-trained and then fine-tuned on the downstream task). We show that the pre-trained Transformers, RoBERTa and LXMERT, are able to identify the most salient information independently of the order in which the dialogue history is processed. Moreover, we find that RoBERTa handles the dialogue structure to some extent, whereas LXMERT can effectively ground short dialogues but fails to process longer dialogues with a more complex structure.