Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study

In this work, we present an extensive study on the use of pre-trained language models for the task of automatic Counter Narrative (CN) generation to fight online hate speech in English. We first present a comparative study to determine whether there is a particular Language Model (or class of LMs) and a particular decoding mechanism that are the most appropriate to generate CNs. Findings show that autoregressive models combined with stochastic decodings are the most promising. We then investigate how an LM performs in generating a CN with regard to an unseen target of hate. We find out that a key element for successful 'out of target' experiments is not an overall similarity with the training data but the presence of a specific subset of training data, i.e. a target that shares some commonalities with the test target that can be defined a-priori. We finally introduce the idea of a pipeline based on the addition of an automatic post-editing step to refine generated CNs.

Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study / Tekiroglu, Serra Sinem; Bonaldi, Helena; Fanton, Margherita; Guerini, Marco. - (2022), pp. 3099-3114. (Paper presented at the 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022, held in Dublin, Ireland, 22nd-27th May 2022) [10.18653/v1/2022.findings-acl.245].

Tekiroglu, Serra Sinem (first author); Bonaldi, Helena (second author); Fanton, Margherita (penultimate author); Guerini, Marco (last author)

2022-01-01

Date of publication: 2022
Proceedings title: Findings of the Association for Computational Linguistics: ACL 2022
Place of publication: 209 N. Eighth Street, Stroudsburg PA 18360, USA
Publisher: Association for Computational Linguistics (ACL)
ISBN: 978-1-955917-25-4
Scopus identifier: 2-s2.0-85140792267
WOS identifier: WOS:000828767403016
All authors: Tekiroglu, Serra Sinem; Bonaldi, Helena; Fanton, Margherita; Guerini, Marco
Publication type: 04.1 Paper in Proceedings (Saggio in atti di convegno)

Files in this item:

File	Access	Type	License	Size	Format
2022.findings-acl.245.pdf	Open access	Publisher's version (publisher's layout)	Creative Commons	403.12 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/370031
