Who Are We Talking About? Handling Person Names in Speech Translation

Gaido, Marco; Negri, Matteo; Turchi, Marco

doi:10.18653/v1/2022.iwslt-1.6

Recent work has shown that speech translation (ST) systems, like automatic speech recognition (ASR) systems, handle person names poorly. This shortcoming not only leads to errors that can seriously distort the meaning of the input, but also hinders the adoption of such systems in application scenarios (such as computer-assisted interpreting) where the translation of named entities, including person names, is crucial. In this paper, we first analyse the outputs of ASR/ST systems to identify the reasons for failures in person name transcription/translation. Besides frequency in the training data, we pinpoint the nationality of the referred person as a key factor. We then mitigate the problem by creating multilingual models, and further improve our ST systems by forcing them to jointly generate transcripts and translations, prioritising the former over the latter. Overall, our solutions yield a relative improvement in token-level person name accuracy of 47.8% on average across three language pairs (en→es,fr,it).

Who Are We Talking About? Handling Person Names in Speech Translation / Gaido, Marco; Negri, Matteo; Turchi, Marco. - (2022), pp. 62-73. (Paper presented at the 19th International Conference on Spoken Language Translation, IWSLT 2022, held in Dublin, Ireland (in-person and online), 26-27 May 2022) [10.18653/v1/2022.iwslt-1.6].

2022-01-01

Date of publication: 2022
Proceedings title: Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Place of publication: Dublin, Ireland (in-person and online)
Publisher: Association for Computational Linguistics (ACL)
ISBN: 978-1-955917-41-4
Scopus identifier: 2-s2.0-85137493637
WOS identifier: WOS:000846899900006
Authors: Gaido, Marco; Negri, Matteo; Turchi, Marco
Appears in categories: 04.1 Paper in Proceedings

Files in this item:

File	Size	Format
2022.iwslt-1.6.pdf (open access; type: publisher's layout; license: Creative Commons)	290.43 kB	Adobe PDF