Aligning the IndoWordNet with the Princeton WordNet / Chandran Nair, Nandu; Rajendran, S. Velayuthan Nair; Batsuren, Khuyagbaatar. - (2019), pp. 9-16. (Paper presented at the ICNLSP 2019 conference, held in Trento on 12-13 September 2019).

Aligning the IndoWordNet with the Princeton WordNet

Chandran Nair, Nandu; Batsuren, Khuyagbaatar
2019-01-01

Abstract

IndoWordNet is a lexical resource for Indian languages. The project started with HindiWordNet, which was built manually from various resources with a preference for culture-specific synsets. Other languages were added later. The development approach used in IndoWordNet is very similar to that used in Princeton WordNet (PWN). PWN is a semantic network in which English synsets are nodes and semantic relations are the edges connecting them. Because of the popularity of PWN, IndoWordNet also connected Hindi and English through direct (synonymy) and hypernymy linkages between their synsets. Because of the diversity of the languages, these linkages produce three types of mappings between IndoWordNet and PWN, which cause misalignment. This paper proposes to align IndoWordNet with PWN using a large-scale lexical-semantic resource called the Universal Knowledge Core (UKC), which forms a semantic network whose nodes are language-independent concepts. In the UKC, semantic relations connect concepts rather than synsets.
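As a rough illustration of the idea described in the abstract, the sketch below models language-specific synsets that point to shared, language-independent concepts, with semantic relations stored between concepts rather than between synsets. All class names, identifiers, language codes, and sample entries are hypothetical; they are not taken from the UKC, IndoWordNet, or PWN data, and the pairing function is only one simple way such an alignment could be read off.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Language-independent node (hypothetical stand-in for a UKC concept)."""
    concept_id: str
    hypernyms: set = field(default_factory=set)  # ids of parent concepts


@dataclass(frozen=True)
class Synset:
    """Language-specific set of words that lexicalizes one concept."""
    language: str
    words: tuple
    concept_id: str


def aligned_synsets(synsets, source, target):
    """Pair synsets of two languages that point to the same concept."""
    by_concept = {}
    for s in synsets:
        by_concept.setdefault(s.concept_id, {}).setdefault(s.language, []).append(s)
    pairs = []
    for langs in by_concept.values():
        for a in langs.get(source, []):
            for b in langs.get(target, []):
                pairs.append((a, b))
    return pairs


# Illustrative data only; ids and words are made up for this sketch.
concepts = {
    "c_animal": Concept("c_animal"),
    "c_dog": Concept("c_dog", hypernyms={"c_animal"}),
}
synsets = [
    Synset("eng", ("dog",), "c_dog"),
    Synset("hin", ("kutta",), "c_dog"),
    Synset("eng", ("animal",), "c_animal"),
]

pairs = aligned_synsets(synsets, "hin", "eng")
```

In this toy setup the hypernymy edge lives on the concept `c_dog`, so neither the Hindi nor the English synset needs its own copy of the relation; both inherit it through the concept they share.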
2019
ICNLSP 2019: Proceedings of the 3rd International Conference on Natural Language and Speech Processing
Stroudsburg, PA
Association for Computational Linguistics
978-1-950737-62-8
Chandran Nair, Nandu; Rajendran, S. Velayuthan Nair; Batsuren, Khuyagbaatar
Files in this record:
Aligning the IndoWordNet with the Princeton WordNet.pdf

Access: open access
Type: Refereed post-print (refereed author's manuscript)
License: All rights reserved
Size: 1.91 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/251461