Language model (LM) adaptation is needed to improve the performance of language-based interaction systems. There are two important issues regarding LM adaptation; the selection of the target data set and the mathematical adaptation model. In the literature, usually statistics are drawn from the target data set (e.g. cache model) to augment (e.g. linearly) background statistical language models, as in the case of automatic speech recognition (ASR). Such models are relatively inexpensive to train, however they do not provide the necessary high-dimensional language context description needed for language-based interaction. Instance-based learning provides high-dimensional description of the lexical, semantic, or dialog context. In this paper, we present an instance-based approach to LM adaptation. We show that by retrieving similar instances from the training data and adapting the model with these instances, we can improve the performance of LMs. We propose two different similarity metrics for instance retrieval, edit distance and n-gram match score. We have performed instance-based adaptation on feed forward neural network LMs (NNLMs) to re-score n-best lists for ASR on the LUNA corpus, which includes conver sational speech. We have achieved significant improvements in word error rate (WER) by using instance-based on-line LM adaptation on feed forward NNLMs.
Instance-Based On-line Language Model Adaptation / Bayer, Ali Orkan; Riccardi, Giuseppe. - STAMPA. - (2013), pp. 2688-2692. (Intervento presentato al convegno INTERSPEECH 2013 tenutosi a Lyon France nel 25-29 August 2013).
Instance-Based On-line Language Model Adaptation
Bayer, Ali Orkan;Riccardi, Giuseppe
2013-01-01
Abstract
Language model (LM) adaptation is needed to improve the performance of language-based interaction systems. There are two important issues regarding LM adaptation; the selection of the target data set and the mathematical adaptation model. In the literature, usually statistics are drawn from the target data set (e.g. cache model) to augment (e.g. linearly) background statistical language models, as in the case of automatic speech recognition (ASR). Such models are relatively inexpensive to train, however they do not provide the necessary high-dimensional language context description needed for language-based interaction. Instance-based learning provides high-dimensional description of the lexical, semantic, or dialog context. In this paper, we present an instance-based approach to LM adaptation. We show that by retrieving similar instances from the training data and adapting the model with these instances, we can improve the performance of LMs. We propose two different similarity metrics for instance retrieval, edit distance and n-gram match score. We have performed instance-based adaptation on feed forward neural network LMs (NNLMs) to re-score n-best lists for ASR on the LUNA corpus, which includes conver sational speech. We have achieved significant improvements in word error rate (WER) by using instance-based on-line LM adaptation on feed forward NNLMs.File | Dimensione | Formato | |
---|---|---|---|
bayer_is13.pdf
Solo gestori archivio
Tipologia:
Versione editoriale (Publisher’s layout)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
199.76 kB
Formato
Adobe PDF
|
199.76 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione