Adding a Novel Italian Treebank of Marked Constructions to Universal Dependencies / Paccosi, Teresa; Palmero Aprosio, Alessio; Tonelli, Sara. - In: IJCOL. - ISSN 2499-4553. - 9:1(2023). [10.4000/ijcol.1110]

Adding a Novel Italian Treebank of Marked Constructions to Universal Dependencies

Paccosi, Teresa; Palmero Aprosio, Alessio; Tonelli, Sara
2023-01-01

Abstract

In this paper we present MarkIT, a novel treebank developed to analyse marked constructions in Italian. The resource contains almost 1,300 sentences manually annotated with dependency relations following the Universal Dependencies paradigm. The sentences were extracted from essays written by high-school students over several years, which accounts for the variability in their structure and topic. We detail the process used to select the sentences, parse them automatically and then correct them manually. The resource covers seven types of marked constructions (839 sentences overall), plus 453 sentences whose syntax can be wrongly classified as marked and which can serve as negative examples of markedness. We also present an evaluation of parsing performance, comparing a model trained on existing Italian treebanks with a model obtained by adding MarkIT to the training set.
Files in this item:
ijcol-1110.pdf (open access)

Description: The text only may be used under licence CC BY-NC-ND 4.0. All other elements (illustrations, imported files) are “All rights reserved”, unless otherwise stated.
Type: Publisher's version (Publisher’s layout)
Licence: Creative Commons
Size: 1.08 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/412713