DecOp: A multilingual and multi-domain corpus for detecting deception in typed text


In recent years, growing interest in automatic approaches for unmasking deception in online sources has led to promising results. Nonetheless, two major issues remain unsolved: the stability of classifier performance across different domains and across different languages. Tackling these issues is challenging because labelled corpora that cover multiple domains and more than one language are scarce in the scientific literature. To fill this gap, in this paper we introduce DecOp (Deceptive Opinions), a new language resource developed for automatic deception detection in cross-domain and cross-language scenarios. DecOp is composed of 5000 examples of truthful and deceitful first-person opinions, balanced across five different domains and two languages, and, to the best of our knowledge, is the largest corpus allowing cross-domain and cross-language comparisons in deceit detection tasks. We describe the collection procedure of the DecOp corpus and its main characteristics. We also report human performance on the DecOp test set and preliminary experiments with machine learning models based on the Transformer architecture.

DecOp: A multilingual and multi-domain corpus for detecting deception in typed text / Capuozzo, P., Lauriola, I., Strapparava, C., Aiolli, F., Sartori, G. - (2020), pp. 1423-1430. (12th International Conference on Language Resources and Evaluation, LREC 2020, Palais du Pharo, France, 2020).


Capuozzo P.; Lauriola I.; Strapparava C.; Aiolli F.; Sartori G.

2020-01-01



Date of publication: 2020
Proceedings title: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Place of publication: France
Publisher: European Language Resources Association (ELRA)
ISBN: 979-10-95546-34-4
Scopus identifier: 2-s2.0-85096606076
WOS identifier: WOS:000724697201062
All authors: Capuozzo, P.; Lauriola, I.; Strapparava, C.; Aiolli, F.; Sartori, G.
Appears in types: 04.1 Paper in Proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/341957
