Speech Adaptation Modeling for Statistical Machine Translation

Ruiz, Nicholas

Spoken language translation (SLT) exists within one of the most challenging intersections of speech and natural language processing. While machine translation (MT) has demonstrated its effectiveness on the translation of textual data, the translation of spoken language remains a challenge, largely due to the mismatch between the training conditions of MT and the noisy signal that is output by an automatic speech recognition (ASR) system. In the interchange between ASR and MT, errors propagated from noisy speech recognition outputs may become compounded, rendering the speech translation unintelligible. Additionally, stylistic differences between written and spoken registers can lead to inadequate translations. This scenario is predominantly caused by a mismatch between the training conditions of ASR and MT. Due to the lack of training data that couples speech audio with translated transcripts, MT systems in the SLT pipeline must rely predominantly on textual data that does not represent the characteristics of spoken language well. Likewise, independence assumptions between sentences result in ASR and MT systems that do not yield consistent outputs. In this thesis, we develop techniques to overcome the mismatch between speech and textual data by improving the robustness of the MT system. Our work can be divided into three parts. First, we analyze the effects that the difference between spoken and written registers has on SLT quality. We additionally introduce a data analysis methodology to measure the impact of ASR errors on translation quality. Second, we propose several approaches to improve the MT component's tolerance of noisy ASR outputs: adapting its models based on the bilingual statistics of each sentence's neighboring context, and introducing a process by which textual resources can be transformed into synthetic ASR data for training a speech-centric MT system.
In particular, we focus on the translation from spoken English to French and German -- the two parent languages of English -- and demonstrate that information about the types and frequency of ASR errors can improve the robustness of machine translation for SLT. Finally, we introduce and motivate several challenges in spoken language translation with neural machine translation models that are specific to their modeling architecture.

Speech Adaptation Modeling for Statistical Machine Translation / Ruiz, Nicholas. - (2017), pp. 1-175.

2017-01-01


Defended on: 2017
Cycle: XXVII
Academic year: 2017-2018
Department: Ingegneria e scienza dell'Informaz (29/10/12-)
Doctoral programme: Information and Communication Technology
Unitn internal supervisor: Federico, Marcello
Bi-nationally supervised doctoral thesis: no
Language: English
Reference SSD (valid until 24/06/2024): INF/01 - Informatica (Computer Science)
Appears in types: 08.1 Tesi di dottorato (Doctoral Thesis)

Files in this record:

File	Access	Type	License	Size	Format
DECLARATORIA_ENG.pdf	Archive managers only	Doctoral Thesis	All rights reserved	757.98 kB	Adobe PDF
speech-adaptation-modeling_(6).pdf	Open access	Doctoral Thesis	All rights reserved	1.29 MB	Adobe PDF