Deception Detection in Italian Court testimonies

Fornaciari, Tommaso

doi:10.15168/11572_369179

Effective methods for evaluating the reliability of statements made by witnesses and defendants in hearings would be extremely valuable to decision-making in Court and in other legal settings. In recent years, methods relying on stylometric techniques have proven the most successful for this task; however, few of them have been tested on language collected in real-life situations of high-stakes deception, so their usefulness outside laboratory conditions has yet to be properly assessed. DeCour (the DEception in COURt corpus) was built to train models able to discriminate, from a stylometric point of view, between sincere and deceptive statements. DeCour is a collection of hearings held in four Italian Courts in which the speakers lie in front of the judge. These hearings then became the object of a specific criminal proceeding for calumny or false testimony, in which the deceptiveness of the defendant's statements was ascertained. Thanks to the final Court judgment, which identifies the lies that were told, each utterance of the corpus has been annotated as true, uncertain or false, according to its degree of truthfulness. Since the judgment of deceptiveness follows a judicial inquiry, the annotation was carried out with a greater degree of confidence than ever before. In Italy, this is the first corpus of deceptive texts that does not rely on 'mock' lies created in laboratory conditions but was collected in a natural environment. In this dissertation we replicated methods that were used in previous studies but had never before been applied to high-stakes data, and we tested new methods. Among the best-known proposals in this direction are the methods of Pennebaker and colleagues, who employed their lexicon, the Linguistic Inquiry and Word Count (LIWC), to analyze texts and transcriptions of spoken language in which deception could have been used, but which were collected in an artificial way.
In our experiments, we trained machine learning models on both lexical features drawn from LIWC and surface features. The surface features were selected either by computing their Information Gain or simply according to their frequency in the texts. We also considered the effect of several variables, including the degree of certainty with which the utterances were annotated as truthful or not and the homogeneity of the dataset. In particular, false utterances were classified either against only the utterances annotated as true, or against the utterances annotated as true and as uncertain together. Moreover, we analysed subsets of DeCour in which the statements were issued by homogeneous categories of subjects, e.g. speakers of the same gender, age or native language. Our results suggest that deception detection accuracy clearly above chance level can be obtained with real-life data as well.
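
As an illustration of the Information Gain criterion mentioned in the abstract, the following is a minimal sketch that ranks tokens by how much their presence reduces the entropy of the true/false labels. It is not the thesis's implementation: the function names, the whitespace tokenization and the four toy utterances (placeholder tokens, not DeCour data) are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(utterances, labels, token):
    """Entropy reduction from splitting utterances on presence of `token`."""
    with_token = [y for u, y in zip(utterances, labels) if token in u.split()]
    without_token = [y for u, y in zip(utterances, labels) if token not in u.split()]
    total = len(labels)
    conditional = (
        (len(with_token) / total) * entropy(with_token)
        + (len(without_token) / total) * entropy(without_token)
    )
    return entropy(labels) - conditional


def rank_tokens(utterances, labels):
    """Tokens sorted by decreasing Information Gain (ties broken alphabetically)."""
    vocabulary = {tok for u in utterances for tok in u.split()}
    return sorted(
        vocabulary,
        key=lambda t: (-information_gain(utterances, labels, t), t),
    )


# Hypothetical toy data: placeholder tokens, NOT taken from the corpus.
toy_utterances = ["alpha beta", "alpha gamma", "delta beta", "delta gamma"]
toy_labels = ["false", "false", "true", "true"]

ranked = rank_tokens(toy_utterances, toy_labels)
```

In this toy setting, `alpha` and `delta` separate the two classes perfectly and rank first, while `beta` and `gamma` appear equally often in both classes and carry no information. A frequency-based selection, the alternative named in the abstract, would instead keep the most common tokens regardless of their class distribution.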

Deception Detection in Italian Court testimonies / Fornaciari, Tommaso. - (2012), pp. 1-105. [10.15168/11572_369179]

2012-01-01

Defended on: 2012
Cycle: XXV
Academic year: 2011-2012
Department: CIMEC (29/10/12-)
Doctoral programme: Cognitive and Brain Sciences
Unitn internal supervisor: Poesio, Massimo
Bi-nationally supervised Doctoral Thesis: no
DOI: https://dx.doi.org/10.15168/11572_369179
Language: English
Reference SSD (valid until 24/06/2024): L-LIN/01 - Glottology and Linguistics; M-PSI/01 - General Psychology
Appears in types: 08.1 Doctoral Thesis

Files in this item:

File	Size	Format
tfthesis.pdf (open access; Type: Doctoral Thesis; License: All rights reserved)	1.04 MB	Adobe PDF