“I’ve Seen Things You People Wouldn’t Believe”: Hallucinating Entities in GuessWhat?! / Testoni, Alberto; Bernardi, Raffaella. - ELECTRONIC. - (2021), pp. 101-111. (Paper presented at the ACL Student Research Workshop (ACL SRW), held online, 1-6 August 2021) [10.18653/v1/2021.acl-srw.11].
“I’ve Seen Things You People Wouldn’t Believe”: Hallucinating Entities in GuessWhat?!
Testoni, Alberto; Bernardi, Raffaella
2021-01-01
Abstract
Natural language generation systems have witnessed important progress in recent years, yet they are known to generate tokens that are unrelated to the source input. This problem affects computational models across many NLP tasks, and it is particularly troublesome in multimodal systems. In this work, we assess the rate of object hallucination in multimodal conversational agents playing the GuessWhat?! referential game. Since better visual processing has been shown to mitigate this issue in image captioning, we adapt the best available visual processing models to the GuessWhat?! task and propose two new models to play the Questioner agent. We show that the new models generate fewer hallucinations than other well-known models from the literature. Moreover, their hallucinations are less severe (they affect task accuracy less) and are more human-like. We also analyse where hallucinations tend to occur in the dialogue: they are less frequent in earlier turns, trigger a cascade of further hallucinations, and are often preceded by negative answers, which have been shown to be harder to ground.
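The abstract measures object hallucination as the generation of object mentions that do not appear in the image. As a rough illustration only (this is not the authors' released code), the sketch below shows one way such a rate could be computed, assuming a hypothetical set of object-category words (e.g. COCO category names) and per-image ground-truth categories.

```python
# Hypothetical sketch (not the paper's pipeline): flag generated GuessWhat?!
# questions that mention object categories absent from the image annotations.
from typing import List, Set, Tuple

def hallucinated_mentions(question: str, image_categories: Set[str],
                          vocabulary: Set[str]) -> List[str]:
    """Return object words mentioned in the question but not present in the image.

    `vocabulary` (the object-category words to check) and `image_categories`
    (ground-truth categories for the image) are assumed inputs.
    """
    tokens = question.lower().strip("?").split()
    return [t for t in tokens if t in vocabulary and t not in image_categories]

def hallucination_rate(dialogues: List[Tuple[List[str], Set[str]]],
                       vocabulary: Set[str]) -> float:
    """Fraction of generated questions containing at least one hallucinated object."""
    questions = [(q, cats) for qs, cats in dialogues for q in qs]
    flagged = sum(bool(hallucinated_mentions(q, cats, vocabulary))
                  for q, cats in questions)
    return flagged / len(questions) if questions else 0.0

# Toy usage: the image contains a dog and a bench; "cat" is hallucinated.
dialogues = [(["is it the dog?", "is it the cat on the sofa?"], {"dog", "bench"})]
print(hallucination_rate(dialogues, vocabulary={"dog", "cat", "sofa", "bench"}))  # 0.5
```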
File | Description | Type | License | Size | Format
---|---|---|---|---|---
2021.acl-srw.11.pdf (open access) | Main article | Publisher's version (publisher's layout) | Creative Commons | 899.78 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.