Adaptation Methods for Statistical Machine Translation In Business Scenarios / Mathur, Prashant. - (2017), pp. 1-194.

Adaptation Methods for Statistical Machine Translation In Business Scenarios

Mathur, Prashant
2017-01-01

Abstract

Adaptation methods for phrase-based statistical Machine Translation (MT) have been explored in the literature under different paradigms, such as domain adaptation and topic adaptation, but mostly in rather idealized experimental set-ups. We address this subject in three real-life industrial use cases, in which MT has to adapt quickly to specific operating conditions. First, we explore domain adaptation when no in-domain parallel data are available, which is a typical situation for MT service providers. Then, we investigate topic adaptation for the translation of short, highly ambiguous item titles in an e-commerce setting. Finally, we consider the Computer Assisted Translation (CAT) scenario, in which MT interacts with a human translator by providing them with translation drafts and by adapting from their post-edits. In this scenario, we investigate online adaptation from human post-edits in both a single-user setting and a multi-user setting, in which multiple translators work on different parts of the same document. For the single-user case, we also discuss the optimisation of the hyper-parameters of the online adaptation method employed.
2017
XXVII
2017-2018
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Federico, Marcello
Cettolo, Mauro
no
English
Sector INF/01 - Computer Science
Files in this item:

PrashantThesis-uploaded.pdf
Open access
Type: Doctoral Thesis
License: All rights reserved
Size: 1.74 MB
Format: Adobe PDF

Disclaimer_Mathur.pdf
Archive administrators only
Type: Doctoral Thesis
License: All rights reserved
Size: 317.23 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/368157