We present experiments that show the influence of native language on lexical choice when producing text in another language, in this case English. We start from the premise that non-native English speakers will choose lexical items that are close to words in their native language. This leads us to an etymology-based representation of documents written by people whose mother tongue is an Indo-European language. Based on this representation we grow a language family tree that closely matches the Indo-European language tree.
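As a rough illustration of the idea summarized above, the following minimal sketch represents each document by the share of its words that come from each etymological origin, then merges the closest documents step by step into a tree. The abstract does not specify the actual features, lexicon, or tree-growing procedure, so everything here is an assumption made for illustration: the tiny `ETYMOLOGY` lexicon with two coarse origin labels, the invented example texts, the Euclidean distance, and the average-linkage clustering in the hypothetical `grow_tree` helper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy lexicon (assumption): English word -> coarse etymological origin.
ETYMOLOGY = {
    "freedom": "germanic", "liberty": "romance",
    "begin": "germanic", "commence": "romance",
    "help": "germanic", "assist": "romance",
    "ask": "germanic", "inquire": "romance",
}
ORIGINS = ("germanic", "romance")


def etymology_vector(text):
    """Relative frequency of each origin among the words found in the lexicon."""
    counts = Counter(ETYMOLOGY[w] for w in text.lower().split() if w in ETYMOLOGY)
    total = sum(counts.values()) or 1  # avoid division by zero for empty matches
    return [counts[origin] / total for origin in ORIGINS]


def distance(u, v):
    """Euclidean distance between two etymology vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def grow_tree(vectors):
    """Average-linkage agglomerative clustering; returns nested tuples of labels."""
    # Each cluster is a pair: (subtree, list of member vectors).
    clusters = [(label, [vec]) for label, vec in vectors.items()]
    while len(clusters) > 1:
        def linkage(pair):
            (_, xs), (_, ys) = clusters[pair[0]], clusters[pair[1]]
            return sum(distance(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

        # Merge the two clusters with the smallest average pairwise distance.
        i, j = min(combinations(range(len(clusters)), 2), key=linkage)
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Invented example texts (assumption), keyed by the writer's native language.
docs = {
    "German": "we begin to ask for help and freedom",
    "Dutch": "they begin to ask for freedom and assist",
    "French": "we commence to inquire about liberty and assist",
    "Italian": "they commence to inquire about liberty and help",
}
vectors = {lang: etymology_vector(text) for lang, text in docs.items()}
tree = grow_tree(vectors)
print(tree)
```

On this toy input the texts that favor Germanic-origin words are grouped together before they are joined with the texts that favor Romance-origin words. A realistic experiment would need a full etymological dictionary and a learner corpus, neither of which is modeled here.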
Word Etymology as Native Language Interference / Nastase, Vivi; Strapparava, Carlo. - (2017), pp. 2702-2707. (Paper presented at the conference Empirical Methods in Natural Language Processing (EMNLP 2017), held in Copenhagen, Denmark, in September) [10.18653/v1/D17-1286].
Word Etymology as Native Language Interference
Carlo Strapparava
2017-01-01