Combining Machine Translation Systems for Spoken Language Understanding Portability

Riccardi, Giuseppe
2012

Abstract

We address the problem of learning Spoken Language Understanding (SLU) models for multiple target languages. Learning such models requires annotated corpora, and porting them to a new language would require corpora with parallel text translations and semantic annotations. In this paper we investigate how to learn an SLU model in a target language starting from no target text and no semantic annotation. Our proposed algorithm exploits the diversity (with regard to performance and coverage) of multiple translation systems to transfer statistically stable word-to-concept mappings for the Romance language pair French and Spanish. Each translation system performs differently at the lexical level (as measured by BLEU). The best performance on the semantic task is obtained by combining the translation systems at different stages of the portability methodology. We have evaluated the portability algorithms on the French MEDIA corpus, using French as the source language and Spanish as the target language. The experiments show the effectiveness of the proposed methods with respect to the source language SLU baseline.
Spoken Language Technology Workshop (SLT), 2012 IEEE
Miami, FL, USA
IEEE
9781467351256
Garcia, F.; Hurtado, L. F.; Segarra, E.; Sanchis, E.; Riccardi, Giuseppe
Use this identifier to cite or link to this document: http://hdl.handle.net/11572/96501