MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology / Batsuren, Khuyagbaatar; Bella, Gabor; Giunchiglia, Fausto. - (2021), pp. 39-48. (Paper presented at the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, SIGMORPHON 2021, held in Bangkok, Thailand (online) on 5 August 2021).

MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology

Batsuren, Khuyagbaatar; Bella, Gabor; Giunchiglia, Fausto
2021-01-01

Abstract

Large-scale morphological databases provide essential input to a wide range of NLP applications. Inflectional data is of particular importance for morphologically rich (agglutinative and highly inflecting) languages, and derivations can be used, e.g., to infer the semantics of out-of-vocabulary words. Extending the scope of state-of-the-art multilingual morphological databases, we announce the release of MorphyNet, a high-quality resource with 15 languages, 519k derivational and 10.1M inflectional entries, and a rich set of morphological features. MorphyNet was extracted from Wiktionary using both hand-crafted and automated methods, and was manually evaluated to be of a precision higher than 98%. Both the resource generation logic and the resulting database are made freely available and are reusable as stand-alone tools or in combination with existing resources.
2021
18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology: Proceedings of the Workshop
209 N EIGHTH STREET, STROUDSBURG, PA 18360 USA
Association for Computational Linguistics (ACL)
978-1-954085-62-6
Files in this item:

MorphyNet_SIGMORPHON(2).pdf
Access: open access
Type: Post-print (refereed author's manuscript)
License: All rights reserved
Size: 314.34 kB
Format: Adobe PDF

1_PDFsam_2021_SIGMORPHON_Proceedings.pdf
Access: archive managers only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 498.46 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/313146
Citations
  • PMC: ND
  • Scopus: 20
  • ISI: 5
  • OpenAlex: ND