CogNet: a Large-Scale Cognate Database / Batsuren, Khuyagbaatar; Bella, Gabor; Giunchiglia, Fausto. - (2020), pp. 3136-3145. (Paper presented at the 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019, held in Florence, 28 July-2 August 2019) [10.18653/v1/P19-1302].
CogNet: a Large-Scale Cognate Database
Batsuren, Khuyagbaatar; Bella, Gabor; Giunchiglia, Fausto
2020-01-01
Abstract
This paper introduces CogNet, a new, large-scale lexical database that provides cognates (words of common origin and meaning) across languages. The database currently contains 3.1 million cognate pairs across 338 languages using 35 writing systems. The paper also describes the automated method by which cognates were computed from publicly available wordnets, with an evaluated accuracy of 94%. Finally, statistics and early insights about the cognate data are presented, hinting at possible future exploitation of the resource by various fields of linguistics.

File | Size | Format
---|---|---
2019-ACL-Cognet.pdf (open access; type: publisher's version, publisher's layout; license: all rights reserved) | 2.73 MB | Adobe PDF