Resources for linguistically motivated Multilingual Anaphora Resolution / Rodriguez, Kepa Joseba. - (2010), pp. 1-109.

Resources for linguistically motivated Multilingual Anaphora Resolution

Rodriguez, Kepa Joseba
2010-01-01

Abstract

A current trend in computational linguistics and natural language processing is the development of multilingual tools for tasks such as information retrieval, summarization of documents in different languages, and machine translation, in all of which the resolution of anaphoric references plays a crucial role. This dissertation proposes an annotation scheme for creating corpus resources for linguistically based multilingual anaphora resolution. The scheme has been applied to the annotation of English and Italian data. Inter-annotator agreement studies show that the annotation scheme is reliable. The annotated corpora have been used for the anaphora resolution task, and the results have been compared with results on well-known corpora. Finally, hand-annotated linguistic features have been used to support the anaphora resolution process. The results show that the proposed multilingual annotation scheme yields data useful for building anaphora resolution systems for languages with different grammatical and typological features, such as English and Italian.
2010
XXIII
2010-2011
Scienze della Cogn e della Form (cess.4/11/12)
Cognitive and Brain Sciences
Poesio, Massimo
no
English
Sector L-LIN/01 - Glottology and Linguistics
Files in this item:
File: PhD-Rodriguez.pdf
Access: open access
Type: Doctoral Thesis
License: All rights reserved
Size: 2.4 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/367836