Using Crowd Agreement for Wordnet Localization / Ganbold, Amarsanaa; Chagnaa, Altangerel; Bella, Gabor. - (2018), pp. 474-478. (Paper presented at the LREC 2018 conference, held in Miyazaki, Japan, 7th-12th May 2018).

Using Crowd Agreement for Wordnet Localization

Bella, Gabor
2018-01-01

Abstract

Building a wordnet from scratch is a huge task, especially for languages less well equipped with pre-existing lexical resources such as thesauri or bilingual dictionaries. We address the high cost of human supervision through crowdsourcing, which offers a good trade-off between output quality and speed of progress. In this paper, we demonstrate a two-phase crowdsourcing workflow consisting of a synset localization step followed by a validation step. Validation relies on the inter-rater agreement metrics Fleiss’ kappa and Krippendorff’s alpha, which allow us to estimate the precision of the result and to balance precision against recall. In our experiment, 947 synsets were localized from English to Mongolian and evaluated through crowdsourcing with the precision of 0.
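As an illustration of the agreement metrics named in the abstract, the following is a minimal sketch of Fleiss' kappa in Python. It uses the standard textbook formula; the function name `fleiss_kappa`, the two vote categories (accept, reject), and the vote counts are all hypothetical and are not taken from the paper, whose exact validation setup is not described on this page.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of per-item category counts.

    counts[i][j] is the number of raters who put item i in category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of votes in each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all votes share one category")

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical votes: 4 candidate synsets, 3 raters each, columns = (accept, reject).
unanimous = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2], [2, 1], [1, 2]]
print(fleiss_kappa(unanimous))
print(fleiss_kappa(split))
```

Values near 1 indicate that raters agree far more often than chance would predict, while values at or below 0 indicate agreement no better than chance. A validation step of this kind could keep only items whose agreement clears a chosen threshold, which is one way to trade recall for precision.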
Year: 2018
Proceedings: Eleventh International Conference on Language Resources and Evaluation, held under the patronage of the Japanese Ministry of Education, Culture, Sports, Science and Technology (MEXT): Conference Proceedings
Place of publication: Paris
Publisher: European Language Resources Association (ELRA)
ISBN: 979-10-95546-00-9
Authors: Ganbold, Amarsanaa; Chagnaa, Altangerel; Bella, Gabor
Files in this item:
File: L18-1074.pdf
Access: open access
Type: Publisher's version (Publisher’s layout)
License: Creative Commons
Size: 269.08 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/313142