Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering

IRIS

Fighting online hate speech is a challenge that is usually addressed using Natural Language Processing via automatic detection and removal of hate content. Besides this approach, counter narratives have emerged as an effective tool employed by NGOs to respond to online hate on social media platforms. For this reason, Natural Language Generation is currently being studied as a way to automate counter narrative writing. However, the existing resources necessary to train NLG models are limited to 2-turn interactions (a hate speech message and a counter narrative in response), while in real life, interactions can consist of multiple turns. In this paper, we present a hybrid approach for dialogical data collection, which combines the intervention of human expert annotators with machine-generated dialogues obtained using 19 different configurations. The result of this work is DIALOCONAN, the first dataset comprising over 3000 fictitious multi-turn dialogues between a hater and an NGO operator, covering 6 targets of hate.

Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering / Bonaldi, Helena; Dellantonio, Sara; Tekiroglu, Serra Sinem; Guerini, Marco. - (2022), pp. 8031-8049. (Paper presented at the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, held in Abu Dhabi, United Arab Emirates, 7-11 December 2022).

Bonaldi, Helena (first author); Dellantonio, Sara (second author); Tekiroglu, Serra Sinem (penultimate author); Guerini, Marco (last author)

2022-01-01

Date of publication: 2022
Proceedings title: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Place of publication: 209 N. Eighth Street, Stroudsburg PA 18360, USA
Publisher: Association for Computational Linguistics (ACL)
Scopus identifier: 2-s2.0-85149437997
All authors: Bonaldi, Helena; Dellantonio, Sara; Tekiroglu, Serra Sinem; Guerini, Marco
					
Publication type: 04.1 Paper in Proceedings

Files in this record:

File	Size	Format
2022.emnlp-main.549.pdf (open access; type: publisher's layout; license: Creative Commons)	456.41 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/370032
