Data labeling systems are designed to facilitate the training and validation of machine learning algorithms under the umbrella of crowdsourcing practices. The current paper presents a novel approach for designing a customized data labeling system, emphasizing two key aspects: an innovative payment mechanism for users and an efficient configuration of output results. The main problem addressed is the labeling of datasets where golden items are utilized to verify user performance and assure the quality of the annotated outputs. Our proposed payment mechanism is enhanced through a modified skip-based golden-oriented function that balances user penalties and prevents spam activities. Additionally, we introduce a comprehensive reporting framework to measure aggregated results and accuracy levels, ensuring the reliability of the labeling output. Our findings indicate that the proposed solutions are pivotal in incentivizing user participation, thereby reinforcing the applicability and profitability of newly launched labeling systems.
A System Design Perspective for Business Growth in a Crowdsourced Data Labeling Practice / Hajipour, Vahid; Jalali, Sajjad; Santos-Arteaga, Francisco Javier; Vazifeh Noshafagh, Samira; Di Caprio, Debora. - In: ALGORITHMS. - ISSN 1999-4893. - 17:8(2024), p. 357. [10.3390/a17080357]
A System Design Perspective for Business Growth in a Crowdsourced Data Labeling Practice
Vazifeh Noshafagh, SamiraPenultimo
;Di Caprio, DeboraUltimo
2024-01-01
Abstract
Data labeling systems are designed to facilitate the training and validation of machine learning algorithms under the umbrella of crowdsourcing practices. The current paper presents a novel approach for designing a customized data labeling system, emphasizing two key aspects: an innovative payment mechanism for users and an efficient configuration of output results. The main problem addressed is the labeling of datasets where golden items are utilized to verify user performance and assure the quality of the annotated outputs. Our proposed payment mechanism is enhanced through a modified skip-based golden-oriented function that balances user penalties and prevents spam activities. Additionally, we introduce a comprehensive reporting framework to measure aggregated results and accuracy levels, ensuring the reliability of the labeling output. Our findings indicate that the proposed solutions are pivotal in incentivizing user participation, thereby reinforcing the applicability and profitability of newly launched labeling systems.File | Dimensione | Formato | |
---|---|---|---|
algorithms-17-00357-v2.pdf
accesso aperto
Tipologia:
Versione editoriale (Publisher’s layout)
Licenza:
Creative commons
Dimensione
4.56 MB
Formato
Adobe PDF
|
4.56 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione