Cervone, Alessandra; Gambi, Enrico; Tortoreto, Giuliano; Stepanov, Evgeny A.; Riccardi, Giuseppe (2018). Automatically Predicting User Ratings for Conversational Systems. Electronic publication, vol. 2253, pp. 99-104. Paper presented at the CLiC-it conference held in Torino, 10th-12th December 2018.
Automatically Predicting User Ratings for Conversational Systems
Alessandra Cervone; Giuliano Tortoreto; Evgeny A. Stepanov; Giuseppe Riccardi
2018-01-01
Abstract
Automatic evaluation models for open-domain conversational agents either correlate poorly with human judgment or require expensive annotations on top of conversation scores. In this work we investigate the feasibility of learning evaluation models without relying on any further annotations besides conversation-level human ratings. We use a dataset of rated (1-5) open-domain spoken conversations between the conversational agent Roving Mind (competing in the Amazon Alexa Prize Challenge 2017) and Amazon Alexa users. First, we assess the complexity of the task by asking two experts to re-annotate a sample of the dataset and observe that the subjectivity of user ratings yields a low upper bound. Second, through an analysis of the entire dataset we show that automatically extracted features such as user sentiment, Dialogue Acts and conversation length have a significant but low correlation with user ratings. Finally, we report the results of our experiments exploring different combinations of these features to train automatic dialogue evaluation models. Our work suggests that predicting subjective user ratings in open-domain conversations is a challenging task.
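To make the modelling setup described in the abstract concrete, below is a minimal sketch (not the authors' implementation) of predicting a 1-5 conversation rating from conversation-level features such as length, mean user sentiment, and a Dialogue Act count. The data layout, feature choices, and the ridge-regression model are assumptions made purely for illustration; the evaluation metric (Pearson correlation with human ratings) mirrors the kind of correlation analysis the abstract refers to.

```python
# Illustrative sketch only: rating prediction from conversation-level features.
# The conversation dict layout ("user_turns", "sentiment", "dialogue_acts",
# "rating") and the ridge regressor are hypothetical choices for this example.
from typing import Dict, List

import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split


def featurize(conv: Dict) -> List[float]:
    """Map one conversation to a small feature vector.
    'sentiment' and 'dialogue_acts' stand in for the outputs of whatever
    sentiment classifier and Dialogue Act tagger are available."""
    turns = conv["user_turns"]                            # list of user utterances
    length = float(len(turns))                            # conversation length
    mean_sentiment = float(np.mean(conv["sentiment"]))    # mean user sentiment, e.g. in [-1, 1]
    n_questions = float(conv["dialogue_acts"].count("question"))  # one DA count as an example
    return [length, mean_sentiment, n_questions]


def train_and_evaluate(conversations: List[Dict]) -> float:
    """Fit a regressor on conversation features and report the Pearson
    correlation between predicted and human ratings on a held-out split."""
    X = np.array([featurize(c) for c in conversations])
    y = np.array([c["rating"] for c in conversations])    # human ratings, 1-5
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = Ridge(alpha=1.0).fit(X_tr, y_tr)
    preds = np.clip(model.predict(X_te), 1.0, 5.0)        # keep predictions on the rating scale
    r, _ = pearsonr(y_te, preds)
    return r
```

Any feature set and regressor could be swapped in here; the point is only that conversation-level ratings serve directly as the supervision signal, with no additional annotation.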
File | Access | Type | License | Size | Format
---|---|---|---|---|---
paper32.pdf | Open access | Publisher's version (Publisher's layout) | Creative Commons | 229.71 kB | Adobe PDF