Understanding the Impact of Text Highlighting in Crowdsourcing Tasks / Ramirez, Jorge; Baez, Marcos; Casati, Fabio; Benatallah, Boualem. - (2019). (Paper presented at the Seventh AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2019), held at Skamania Lodge, WA, Oct 28–30, 2019).

Understanding the Impact of Text Highlighting in Crowdsourcing Tasks

Ramirez, Jorge; Baez, Marcos; Casati, Fabio; Benatallah, Boualem
2019-01-01

Abstract

Text classification is one of the most common goals of machine learning (ML) projects, and also one of the most frequent human intelligence tasks on crowdsourcing platforms. ML has mixed success in such tasks depending on the nature of the problem, while crowd-based classification has proven to be surprisingly effective but can be expensive. Recently, hybrid text classification algorithms, combining human computation and machine learning, have been proposed to improve accuracy and reduce costs. One way to do so is to have ML highlight or emphasize portions of text that it believes to be more relevant to the decision. Humans can then rely only on this text or read the entire text if the highlighted information is insufficient. In this paper, we investigate if and under what conditions highlighting selected parts of the text can (or cannot) improve classification cost and/or accuracy, and in general how it affects the process and outcome of the human intelligence tasks. We study this through a series of crowdsourcing experiments run over different datasets and with task designs imposing different cognitive demands. Our findings suggest that highlighting is effective in reducing classification effort but does not improve accuracy; in fact, low-quality highlighting can decrease it.
Proceedings of the Seventh AAAI Conference on Human Computation and Crowdsourcing
2275 East Bayshore Road, Suite 160 Palo Alto, California 94303 USA
Association for the Advancement of Artificial Intelligence
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/268240