The Efficiency of Question-Asking Strategies in a Real-World Visual Search Task

In recent years, many datasets of human-human conversations have been released, mainly to train conversational agents based on data-hungry artificial neural networks. In this paper, we argue that datasets of this sort are a useful and underexplored resource for validating, complementing, and enhancing cognitive studies of human behavior and language use. We present a method that leverages the recent development of powerful computational models to obtain the fine-grained annotation required to apply metrics and techniques from Cognitive Science to large datasets. Previous work in Cognitive Science has investigated the question-asking strategies of human participants by employing different variants of the so-called 20-question-game setting and proposing several evaluation methods. In our work, we focus on GuessWhat, a task proposed within the Computer Vision and Natural Language Processing communities that is similar in structure to the 20-question-game setting. Crucially, the GuessWhat dataset contains tens of thousands of dialogues based on real-world images, making it a suitable setting to investigate the question-asking strategies of human players on a large scale and in a natural setting. Our results demonstrate the effectiveness of computational tools for automatically coding how the hypothesis space changes throughout the dialogue in complex visual scenes. On the one hand, we confirm findings from previous work on smaller and more controlled settings. On the other hand, our analyses highlight the presence of "uninformative" questions (in terms of Expected Information Gain) at specific rounds of the dialogue. We hypothesize that these questions fulfill pragmatic constraints that human players exploit to solve visual tasks in complex scenes successfully. Our work illustrates a method that brings together efforts and findings from different disciplines to gain a better understanding of human question-asking strategies on large-scale datasets, while at the same time posing new questions about the development of conversational systems.
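
To make concrete what an "uninformative" question means in terms of Expected Information Gain, the following is a minimal sketch, not the authors' annotation pipeline. It assumes a uniform prior over the candidate objects in the image and noise-free yes/no answers; the function name `expected_information_gain` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def expected_information_gain(n_candidates: int, n_yes: int) -> float:
    """Expected Information Gain (in bits) of a yes/no question.

    Assumptions (illustrative only): the target is equally likely to be any
    of the n_candidates objects, and the answer is always correct.
    n_yes is the number of candidates for which the answer would be "yes".
    """
    if n_candidates <= 0 or not 0 <= n_yes <= n_candidates:
        raise ValueError("need n_candidates > 0 and 0 <= n_yes <= n_candidates")

    n_no = n_candidates - n_yes
    prior_entropy = math.log2(n_candidates)

    # Expected entropy after the answer: each answer leaves a uniform
    # distribution over the candidates consistent with it.
    expected_posterior_entropy = 0.0
    if n_yes > 0:
        expected_posterior_entropy += (n_yes / n_candidates) * math.log2(n_yes)
    if n_no > 0:
        expected_posterior_entropy += (n_no / n_candidates) * math.log2(n_no)

    return prior_entropy - expected_posterior_entropy


if __name__ == "__main__":
    # A question that splits 8 candidates in half is maximally informative.
    print(expected_information_gain(8, 4))
    # A question whose answer is the same for every candidate gains nothing.
    print(expected_information_gain(8, 8))
```

Under these assumptions, a question whose answer is already determined by the current hypothesis space has zero Expected Information Gain, which is the sense in which such a question counts as "uninformative" here.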

The Efficiency of Question-Asking Strategies in a Real-World Visual Search Task / Testoni, A.; Bernardi, R.; Ruggeri, A. - In: COGNITIVE SCIENCE. - ISSN 1551-6709. - 47:12(2023), pp. 1-29. [10.1111/cogs.13396]

Testoni, A. (first author); Bernardi, R.; Ruggeri, A.

2023-01-01


Date of publication: 2023
Journal title: COGNITIVE SCIENCE
Issue number and part: 12
DOI: https://dx.doi.org/10.1111/cogs.13396
PubMed identifier: 38142430
Scopus identifier: 2-s2.0-85180757581
WOS identifier: WOS:001132779100001
All authors: Testoni, A.; Bernardi, R.; Ruggeri, A.
Appears in types: 03.1 Journal article

Files in this item:

File: Cognitive Science - 2023 - Testoni - The Efficiency of Question‐Asking Strategies in a Real‐World Visual Search Task.pdf
Access: open access
Type: Publisher's version (publisher's layout)
License: Creative Commons
Size: 2.95 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/401089
