How deceptive messages are produced in different languages has received little attention, with the majority of studies focused on English. Moreover, there is no agreement on whether linguistic cues to deceit are stable across languages. In this paper, we address this issue by analysing both theory-driven linguistic markers of deception (derived from the cognitive load hypothesis) and standard text categorisation features. After compiling a multilingual corpus of honest and deceitful first-person opinions on five different topics, we assessed the cross-language applicability of four different feature sets in within-topic, cross-topic and cross-language binary classification experiments. Results showed promising classification performance in all three experiments, with few exceptions. Interestingly, linguistic markers of deceit linked to the cognitive load hypothesis exhibited the same trend in the two languages under investigation, and the cross-language evaluation highlighted their usefulness in spotting deceit across languages.
Automatic Detection of Cross-language Verbal Deception / Capuozzo, Pasquale; Lauriola, Ivano; Strapparava, Carlo; Aiolli, Fabio; Sartori, Giuseppe. - Electronic. - (2020), pp. 1756-1762. (Paper presented at the 42nd Annual Conference of the Cognitive Science Society (CogSci'20), held as a virtual conference, 29 July – 1 August).
Automatic Detection of Cross-language Verbal Deception
Carlo Strapparava
2020-01-01