Towards end-to-end spoken dialogue systems with turn embeddings / Bayer, Ali Orkan; Stepanov, Evgeny A.; Riccardi, Giuseppe. - 2017-:(2017), pp. 2516-2520. (18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, Sweden, 2017) [10.21437/Interspeech.2017-1577].

Towards end-to-end spoken dialogue systems with turn embeddings

Bayer, Ali Orkan; Stepanov, Evgeny A.; Riccardi, Giuseppe
2017-01-01

Abstract

Training task-oriented dialogue systems requires a significant amount of manual effort and the integration of many independently built components; moreover, the pipeline is prone to error propagation. End-to-end training has been proposed to overcome these problems by training the whole system over the utterances of both dialogue parties. In this paper we present an end-to-end spoken dialogue system architecture that is based on turn embeddings. Turn embeddings encode a robust representation of user turns with a local dialogue history, and they are trained using sequence-to-sequence models. Turn embeddings are trained by generating the previous and the next turns of the dialogue and by additionally performing spoken language understanding. The end-to-end spoken dialogue system is trained using the pre-trained turn embeddings in a stateful architecture that considers the whole dialogue history. We observe that the proposed spoken dialogue system architecture outperforms the models based on local-only...
2017
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
4 Rue des Fauvettes - Lous Tourils
International Speech Communication Association


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/193574