Strategies for addressing performance concerns and bias in designing, running, and reporting crowdsourcing experiments / Ramirez Medina, Jorge Daniel. - (2021 Nov 11), pp. 1-127. [10.15168/11572_321908]
Strategies for addressing performance concerns and bias in designing, running, and reporting crowdsourcing experiments
Ramirez Medina, Jorge Daniel
2021-11-11
Abstract
Crowdsourcing involves releasing tasks on the internet for people with diverse backgrounds and skills to solve. Its adoption has come a long way, from scaling up problem-solving to becoming an environment for running complex experiments. Designing tasks that yield reliable results is not straightforward: it requires many design choices, which grow with the complexity of crowdsourcing projects and often demand multiple trial-and-error iterations to configure properly. These inherent characteristics of crowdsourcing, namely the complexity of the design space and the heterogeneity of the crowd, make quality control a major concern and an integral part of task design. Despite all the progress and guidelines for developing effective tasks, crowdsourcing is still regarded as an "art" rather than an exact science, partly because of the challenges of task design, but also because crowdsourcing now enables more complex use cases for which the available support has not yet caught up. This often leaves researchers and practitioners relying on intuition instead of informed decisions. Running controlled experiments on crowdsourcing platforms is a prominent example. Despite their importance, experiments are not yet first-class citizens on these platforms, forcing researchers to build custom features to compensate for the lack of support, where pitfalls in this process may be detrimental to the experimental outcome. In this thesis, therefore, our goal is to address the need to move crowdsourcing from art to science from two perspectives that interplay with each other: providing guidance on task design through experimentation, and supporting the experimentation process itself. First, we select classification problems as a use case, given their importance and pervasive nature, and aim to bring awareness, empirical evidence, and guidance to previously unexplored task design choices that affect performance. Second, we aim to make crowdsourcing accessible to researchers and practitioners from all backgrounds, reducing the in-depth knowledge of known biases in crowdsourcing platforms, experimental methods, and programming skills otherwise required to overcome the limitations of crowdsourcing providers when running experiments. We start by proposing task design strategies that address workers' performance, in terms of quality and time, in crowdsourced classification tasks. We then distill the challenges associated with running controlled crowdsourcing experiments, propose coping strategies to address them, and introduce solutions that help researchers report their crowdsourcing experiments, moving crowdsourcing toward standardized reporting.
File | Type | License | Access | Size | Format
---|---|---|---|---|---
phd_unitn_Jorge_Ramirez.pdf | Doctoral Thesis | Other type of license | Open access | 6.94 MB | Adobe PDF