The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models

Conti, Lina; Fucci, Dennis; Gaido, Marco; Negri, Matteo; Wisniewski, Guillaume; Bentivogli, Luisa

doi:10.18653/v1/2025.blackboxnlp-1.23

Contrastive explanations, which indicate why an AI system produced one output (the target) instead of another (the foil), are widely recognized in explainable AI as more informative and interpretable than standard explanations. However, obtaining such explanations for speech-to-text (S2T) generative models remains an open challenge. Adopting a feature attribution framework, we propose the first method to obtain contrastive explanations in S2T by analyzing how specific regions of the input spectrogram influence the choice between alternative outputs. Through a case study on gender translation in speech translation, we show that our method accurately identifies the audio features that drive the selection of one gender over another.
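As an illustration only: the abstract does not say which attribution technique is used, so the sketch below swaps in plain occlusion to show what a contrastive explanation over a spectrogram could look like. The contrastive score is the log-probability of the target minus that of the foil, and a time-frequency patch is salient when zeroing it lowers that score. All names here (`contrastive_score`, `occlusion_contrastive_map`, `toy_model`) are hypothetical, and `toy_model` is a stand-in, not a real S2T system.

```python
import numpy as np

def contrastive_score(logits, target, foil):
    """Log-probability of the target token minus that of the foil token."""
    shifted = logits - logits.max()                      # numerically stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(log_probs[target] - log_probs[foil])

def occlusion_contrastive_map(spec, model, target, foil, patch=(4, 4)):
    """Drop in the contrastive score when each time-frequency patch is zeroed."""
    base = contrastive_score(model(spec), target, foil)
    saliency = np.zeros_like(spec, dtype=float)
    pt, pf = patch
    for t in range(0, spec.shape[0], pt):
        for f in range(0, spec.shape[1], pf):
            occluded = spec.copy()
            occluded[t:t + pt, f:f + pf] = 0.0
            score = contrastive_score(model(occluded), target, foil)
            # Positive: the patch supports the target over the foil.
            # Negative: the patch supports the foil over the target.
            saliency[t:t + pt, f:f + pf] = base - score
    return saliency

def toy_model(spec):
    """Hypothetical stand-in for an S2T model (assumption, not from the paper).

    Token 0 is driven by energy in the low-frequency bins, token 1 by energy
    in the high-frequency bins, and token 2 is a constant distractor.
    """
    low = spec[:, :4].sum()
    high = spec[:, 4:].sum()
    return np.array([low, high, 0.0])

# Toy spectrogram: 8 time frames x 8 frequency bins.
spec = np.zeros((8, 8))
spec[:, :4] = 1.0   # low-frequency energy
spec[:, 4:] = 0.5   # high-frequency energy

saliency = occlusion_contrastive_map(spec, toy_model, target=0, foil=1)
```

In this toy setup, low-frequency patches receive positive values (they favor the target) and high-frequency patches receive negative values (they favor the foil), which is the kind of signed evidence a contrastive explanation is meant to expose.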

The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models / Conti, L., Fucci, D., Gaido, M., Negri, M., Wisniewski, G., Bentivogli, L. - Electronic. - (2025), pp. 398-414. (BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, Suzhou, 9th November 2025) [10.18653/v1/2025.blackboxnlp-1.23].



Date of publication: 2025
Proceedings title: Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Place of publication: Suzhou
Publisher: Association for Computational Linguistics
ISBN: 979-8-89176-346-3
Appears in categories: 04.1 Paper in Proceedings

Files in this item:

File	Size	Format
2025.blackboxnlp-1.23.pdf (open access; type: publisher's version (publisher's layout); license: Creative Commons)	1.28 MB	Adobe PDF