Resources for linguistically motivated Multilingual Anaphora Resolution / Rodriguez, Kepa Joseba. - (2010), pp. 1-109.
Resources for linguistically motivated Multilingual Anaphora Resolution
Rodriguez, Kepa Joseba
2010-01-01
Abstract
A current trend in computational linguistics and natural language processing is the implementation of multilingual tools for tasks such as information retrieval, summarization of documents in different languages, and machine translation, in all of which the resolution of anaphoric references plays a crucial role. This dissertation proposes an annotation scheme for creating corpus resources for linguistically motivated multilingual anaphora resolution. The scheme has been applied to the annotation of English and Italian data. Inter-annotator agreement studies show that the annotation scheme is reliable. The annotated corpora have been used for the anaphora resolution task, and the results have been compared with those obtained on well-known corpora. Finally, hand-annotated linguistic features have been used to support the anaphora resolution process. The results show that the proposed multilingual annotation scheme produces data that are useful for building anaphora resolution systems for languages with different grammatical and typological features, such as English and Italian.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
PhD-Rodriguez.pdf | Open access | Doctoral thesis | All rights reserved | 2.4 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.