Adaptation Methods for Statistical Machine Translation In Business Scenarios / Mathur, Prashant. - (2017), pp. 1-194.
Adaptation Methods for Statistical Machine Translation In Business Scenarios
Mathur, Prashant
2017-01-01
Abstract
Adaptation methods for phrase-based statistical Machine Translation (MT) have been explored in the literature under different paradigms, such as domain adaptation and topic adaptation, but mostly in rather ideal experimental set-ups. We address this subject in three real-life industrial use cases, in which MT has to adapt quickly to specific operating conditions. First, we explore domain adaptation when no in-domain parallel data are available, which is a typical use case for MT service providers. Then, we investigate topic adaptation for the translation of short, highly ambiguous item titles in an e-commerce setting. Finally, we consider the Computer Assisted Translation (CAT) scenario, in which MT interacts with a human translator by providing them with translation drafts and by adapting from their post-editions. In this scenario, we investigate online adaptation from human post-editions, first in a single-user setting and then in a multi-user setting, in which multiple translators work on different parts of the same document. In addition, for the single-user case we discuss the optimisation of the hyper-parameters of the employed online adaptation method.

File | Size | Format | |
---|---|---|---|
PrashantThesis-uploaded.pdf (open access; type: Doctoral Thesis; license: All rights reserved) | 1.74 MB | Adobe PDF | |
Disclaimer_Mathur.pdf (archive managers only; type: Doctoral Thesis; license: All rights reserved) | 317.23 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.