Deep Semantic Encodings for Language Modeling / Bayer, Ali Orkan; Riccardi, Giuseppe. - (2015), pp. 1448-1452. (Paper presented at INTERSPEECH 2015, held in Dresden, 6-10 September 2015).

Deep Semantic Encodings for Language Modeling

Bayer, Ali Orkan; Riccardi, Giuseppe
2015-01-01

Abstract

Word error rate (WER) is not an appropriate metric for spoken language systems (SLS) because a lower WER does not necessarily yield better understanding performance. Therefore, language models (LMs) used in SLS should be trained to jointly optimize transcription and understanding performance. Semantic LMs (SELMs) are based on the theory of frame semantics and incorporate features of frames and meaning-bearing words (target words) as semantic context when training LMs. The performance of SELMs is affected by errors in the ASR and semantic parser output. In this paper we address the problem of coping with such noise during the training phase of the neural network-based LM architecture. We propose the use of deep autoencoders to encode the semantic context while accounting for ASR errors. We investigate the optimization of SELMs for both transcription and understanding by using deep semantic encodings. Deep semantic encodings suppress the noise introduced by the ASR module and enable SELMs to be optimized adequately. We assess understanding performance by measuring the errors made on target words, and we achieve a 3.7% relative improvement over recurrent neural network LMs.
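To illustrate the idea of encoding noisy semantic features with an autoencoder, the sketch below trains a small denoising autoencoder over binary frame/target-word indicator vectors, where masking noise stands in for features corrupted by ASR errors. This is a minimal illustration only, not the authors' implementation: the feature layout, dimensions, corruption rate, and single tied-weight hidden layer are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DenoisingAutoencoder:
    """One-hidden-layer denoising autoencoder over binary semantic features
    (e.g. frame and target-word indicators). Masking noise on the input
    emulates features dropped or corrupted by ASR errors (illustrative only)."""

    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))  # tied weights
        self.b_h = np.zeros(n_hidden)
        self.b_v = np.zeros(n_visible)
        self.lr = lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_h)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def train_step(self, x, corruption=0.3):
        # Corrupt the input by randomly zeroing features, then reconstruct the clean input.
        mask = rng.random(x.shape) > corruption
        x_tilde = x * mask
        h = self.encode(x_tilde)
        x_hat = self.decode(h)
        # Squared-error gradients, backpropagated through both uses of the tied weights.
        d_out = (x_hat - x) * x_hat * (1 - x_hat)      # (batch, n_visible)
        d_hid = (d_out @ self.W) * h * (1 - h)         # (batch, n_hidden)
        grad_W = x_tilde.T @ d_hid + d_out.T @ h
        self.W -= self.lr * grad_W / x.shape[0]
        self.b_h -= self.lr * d_hid.mean(axis=0)
        self.b_v -= self.lr * d_out.mean(axis=0)
        return float(((x_hat - x) ** 2).mean())

# Toy usage: 200 utterances, 50 binary semantic features each (assumed sizes).
X = (rng.random((200, 50)) < 0.1).astype(float)
dae = DenoisingAutoencoder(n_visible=50, n_hidden=20)
for epoch in range(50):
    loss = dae.train_step(X)
print("reconstruction MSE:", loss)
codes = dae.encode(X)  # dense semantic encodings that could be fed to the LM as context
```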
2015
Interspeech 2015, 16th Annual Conference of the International Speech Communication Association
Dresden, Germany
International Speech Communication Association
Bayer, Ali Orkan; Riccardi, Giuseppe
Files in this item:
bayer_riccardi_is2015_cam_ready.pdf
Access: archive managers only
License: All rights reserved
Size: 234.82 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11572/114925
Citations
  • PMC: ND
  • Scopus: 2
  • Web of Science: 1