Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement / Leonardelli, Elisa; Menini, Stefano; Palmero Aprosio, Alessio; Guerini, Marco; Tonelli, Sara. - (2021), pp. 10528-10539. (Paper presented at the 2021 Conference on Empirical Methods in Natural Language Processing, held online and in Punta Cana, Dominican Republic, November 2021) [10.18653/v1/2021.emnlp-main.822].

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators’ Disagreement

Leonardelli, Elisa; Menini, Stefano; Palmero Aprosio, Alessio; Guerini, Marco; Tonelli, Sara
2021-01-01

Abstract

Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving landscape of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so as to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a trend that has emerged recently, we focus on the level of agreement among annotators when selecting data to create offensive language datasets, a task involving a high degree of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics, each with five crowd-sourced judgments. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotators’ agreement has a strong effect on classifiers’ performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where low agreement is present, are not necessarily due to poor-quality annotation, and we advocate for a higher presence of ambiguous cases in future datasets, in order to train more robust systems and better account for the different points of view expressed online.
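As a rough illustration of the agreement-based data selection described in the abstract, the sketch below groups items by the size of the majority among their five binary crowd-sourced judgments (5-0, 4-1, or 3-2 splits). The field names, bucket labels, and example items are hypothetical and are not taken from the paper's released data or code.

```python
from collections import Counter

def agreement_bucket(judgments):
    """Return an agreement bucket from five binary offensive/not-offensive labels.

    A 5-0 split is unanimous, 4-1 is moderate agreement, and 3-2 is low
    agreement (the "hard", ambiguous cases the abstract refers to).
    """
    assert len(judgments) == 5, "expects exactly five crowd-sourced judgments"
    majority_size = Counter(judgments).most_common(1)[0][1]
    return {5: "unanimous", 4: "moderate", 3: "ambiguous"}[majority_size]

# Hypothetical items: (tweet text, five binary judgments from crowd workers)
items = [
    ("example tweet A", [1, 1, 1, 1, 1]),
    ("example tweet B", [1, 1, 1, 1, 0]),
    ("example tweet C", [1, 0, 1, 0, 1]),
]

# Partition the data by agreement level, e.g. to build training/test splits
# that contain only unanimous items, or that also include ambiguous ones.
buckets = {}
for text, labels in items:
    buckets.setdefault(agreement_bucket(labels), []).append(text)

print({bucket: len(texts) for bucket, texts in buckets.items()})
```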
Year: 2021
Book title: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Conference location: Punta Cana, Dominican Republic
Publisher: Association for Computational Linguistics
ISBN: 978-1-955917-09-4
Authors: Leonardelli, Elisa; Menini, Stefano; Palmero Aprosio, Alessio; Guerini, Marco; Tonelli, Sara
Files in this record:

File: 2021.emnlp-main.822.pdf
Access: open access
Type: Publisher's version (publisher's layout)
License: Creative Commons
Size: 556.17 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/412973