Building task-oriented bots requires mapping a user utterance to an intent with its associated entities to serve the request. Doing so is not easy since it requires large quantities of high-quality and diverse training data to learn how to map all possible variations of utterances with the same intent. Crowdsourcing may be an effective, inexpensive, and scalable technique for collecting such large datasets. However, the diversity of the results suffers from the priming effect (i.e. workers are more likely to use the words in the sentence we are asking to paraphrase). In this paper, we leverage priming as an opportunity rather than a threat: we dynamically generate word suggestions to motivate crowd workers towards producing diverse utterances. The key challenge is to make suggestions that can improve diversity without resulting in semantically invalid paraphrases. To achieve this, we propose a probabilistic model that generates continuously improved versions of word suggestions that balance diversity and semantic relevance. Our experiments show that the proposed approach improves the diversity of crowdsourced paraphrases.
Dynamic word recommendation to obtain diverse crowdsourced paraphrases of user utterances / Yaghoub-Zadeh-Fard, Mohammad-Ali; Benatallah, Boualem; Casati, Fabio; Barukh, Moshe Chai; Zamanirad, Shayan. - (2020), pp. 55-66. (Paper presented at the 25th ACM International Conference on Intelligent User Interfaces, IUI 2020, held in Cagliari, 17-20 March 2020) [10.1145/3377325.3377486].
Dynamic word recommendation to obtain diverse crowdsourced paraphrases of user utterances
Benatallah, Boualem; Casati, Fabio
2020-01-01
File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
3377325.3377486-2.pdf | Open access | Publisher's layout | All rights reserved | 744.07 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.