Language model (LM) adaptation is needed to improve the performance of language-based interaction systems. There are two important issues regarding LM adaptation; the selection of the target data set and the mathematical adaptation model. In the literature, usually statistics are drawn from the target data set (e.g. cache model) to augment (e.g. linearly) background statistical language models, as in the case of automatic speech recognition (ASR). Such models are relatively inexpensive to train, however they do not provide the necessary high-dimensional language context description needed for language-based interaction. Instance-based learning provides high-dimensional description of the lexical, semantic, or dialog context. In this paper, we present an instance-based approach to LM adaptation. We show that by retrieving similar instances from the training data and adapting the model with these instances, we can improve the performance of LMs. We propose two different similarity metrics for instance retrieval, edit distance and n-gram match score. We have performed instance-based adaptation on feed forward neural network LMs (NNLMs) to re-score n-best lists for ASR on the LUNA corpus, which includes conver sational speech. We have achieved significant improvements in word error rate (WER) by using instance-based on-line LM adaptation on feed forward NNLMs.

Instance-Based On-line Language Model Adaptation / Bayer, Ali Orkan; Riccardi, Giuseppe. - STAMPA. - (2013), pp. 2688-2692. ((Intervento presentato al convegno INTERSPEECH 2013 tenutosi a Lyon France nel 25-29 August 2013.

Instance-Based On-line Language Model Adaptation

Bayer, Ali Orkan;Riccardi, Giuseppe
2013

Abstract

Language model (LM) adaptation is needed to improve the performance of language-based interaction systems. There are two important issues regarding LM adaptation; the selection of the target data set and the mathematical adaptation model. In the literature, usually statistics are drawn from the target data set (e.g. cache model) to augment (e.g. linearly) background statistical language models, as in the case of automatic speech recognition (ASR). Such models are relatively inexpensive to train, however they do not provide the necessary high-dimensional language context description needed for language-based interaction. Instance-based learning provides high-dimensional description of the lexical, semantic, or dialog context. In this paper, we present an instance-based approach to LM adaptation. We show that by retrieving similar instances from the training data and adapting the model with these instances, we can improve the performance of LMs. We propose two different similarity metrics for instance retrieval, edit distance and n-gram match score. We have performed instance-based adaptation on feed forward neural network LMs (NNLMs) to re-score n-best lists for ASR on the LUNA corpus, which includes conver sational speech. We have achieved significant improvements in word error rate (WER) by using instance-based on-line LM adaptation on feed forward NNLMs.
Interspeech 2013
France
ISCA
Bayer, Ali Orkan; Riccardi, Giuseppe
Instance-Based On-line Language Model Adaptation / Bayer, Ali Orkan; Riccardi, Giuseppe. - STAMPA. - (2013), pp. 2688-2692. ((Intervento presentato al convegno INTERSPEECH 2013 tenutosi a Lyon France nel 25-29 August 2013.
File in questo prodotto:
File Dimensione Formato  
bayer_is13.pdf

Solo gestori archivio

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 199.76 kB
Formato Adobe PDF
199.76 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11572/115041
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact