Which Turn do Neural Models Exploit the Most to Solve GuessWhat? Diving into the Dialogue History Encoding in Transformers and LSTMs

Greco, Claudio; Testoni, Alberto; Bernardi, Raffaella
2020-01-01

Abstract

We focus on visually grounded dialogue history encoding. We show that GuessWhat?! can be used as a “diagnostic” dataset to understand whether state-of-the-art encoders manage to capture salient information in the dialogue history. We compare models across several dimensions: the architecture (Recurrent Neural Networks vs. Transformers), the input modalities (language only vs. language and vision), and the model's background knowledge (trained from scratch vs. pre-trained and then fine-tuned on the downstream task). We show that pre-trained Transformers identify the most salient information independently of the order in which the dialogue history is processed, whereas LSTM-based models do not.
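Since the abstract contrasts a recurrent encoder, which consumes the dialogue history strictly left to right, with a pre-trained Transformer, whose self-attention sees all turns at once, a minimal sketch of the two encoders and of the turn-order probe may help. This assumes PyTorch and HuggingFace Transformers; the class, variable, and function names are illustrative and do not reflect the authors' actual implementation.

```python
# Illustrative sketch of the two dialogue-history encoders compared in
# the paper; hypothetical names, not the authors' code.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

dialogue_turns = [
    "is it a person? no",
    "is it a vehicle? yes",
    "is it the red car? yes",
]

# --- LSTM encoder: turns are consumed strictly in order ---
class LSTMHistoryEncoder(nn.Module):
    def __init__(self, vocab_size=30522, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(embedded)
        return h_n[-1]  # final hidden state summarizes the whole history

# --- Pre-trained Transformer encoder: self-attention over all turns ---
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def encode_history(turns):
    inputs = tokenizer(" ".join(turns), return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[:, 0]  # [CLS] summary vector

lstm_encoder = LSTMHistoryEncoder()
ids = tokenizer(" ".join(dialogue_turns), return_tensors="pt")["input_ids"]
lstm_state = lstm_encoder(ids)  # order-sensitive summary of the history

# Turn-order probe in the spirit of the paper: encode the same turns in
# reversed order and check how stable the history representation is.
normal = encode_history(dialogue_turns)
reversed_ = encode_history(list(reversed(dialogue_turns)))
print(torch.cosine_similarity(normal, reversed_).item())
```

In the paper the probe is measured downstream, by checking whether the Guesser's task accuracy degrades when the history order is perturbed; comparing representations directly, as above, is just a compact way to illustrate the same idea.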
Year: 2020
Published in: Proceedings of the 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020)
Place of publication: Aachen, Germany
Publisher: CEUR-WS
Which Turn do Neural Models Exploit the Most to Solve GuessWhat? Diving into the Dialogue History Encoding in Transformers and LSTMs / Greco, Claudio; Testoni, Alberto; Bernardi, Raffaella. - ELECTRONIC. - 2735:(2020), pp. 29-43. (Paper presented at the 4th Workshop on Natural Language for Artificial Intelligence, NL4AI 2020, held online, 25th-27th November 2020.)
Files in this record:

File: paper31.pdf
Access: open access
Type: Publisher's version (publisher's layout)
License: Creative Commons
Size: 815.52 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/286801
Citations
  • PMC: not available
  • Scopus: 1
  • Web of Science: not available
  • OpenAlex: not available