Parsing discourse is a challenging natural language processing task. In this paper, we take a data-driven approach to identifying the arguments of explicit discourse connectives. In contrast to previous work, we make no assumptions about the span of the arguments and treat parsing as a token-level sequence labeling task. We design the argument segmentation task as a cascade of decisions based on conditional random fields (CRFs). We train the CRFs on lexical, syntactic, and semantic features extracted from the Penn Discourse Treebank and evaluate feature combinations on the commonly used test split. We show that the best combination of features includes both syntactic and semantic features. A comparative error analysis investigates how performance varies across connective types and argument positions.
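To make the token-level framing concrete, the following minimal Python sketch shows one common way to turn argument spans into per-token labels that a sequence labeler such as a CRF could be trained on. The BIO tagging scheme, the function name `spans_to_bio`, the example sentence, and the span offsets are all assumptions made for illustration; the abstract does not specify the label encoding the authors used.

```python
from typing import List, Tuple


def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, label) token spans into BIO tags.

    Hypothetical helper: tokens outside every span are tagged "O".
    """
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the argument
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the argument
    return tags


# Illustrative sentence with the explicit connective "because".
tokens = ["He", "stayed", "home", "because", "it", "was", "raining", "."]

# Assumed gold spans: Arg1 = "He stayed home", Arg2 = "it was raining".
spans = [(0, 3, "Arg1"), (4, 7, "Arg2")]

tags = spans_to_bio(len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Because each token receives its own label, no constraint ties an argument to a clause or sentence boundary, which is the property the abstract emphasizes.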
Title: | Shallow Discourse Parsing with Conditional Random Fields |
Authors: | Ghosh, Sucheta; Johansson, R.; Riccardi, Giuseppe; Tonelli, Sara |
Unitn authors: | |
Title of the volume containing the paper: | International Joint Conference on Natural Language Processing |
Place of publication: | Thailand |
Publisher: | Haifeng Wang and David Yarowsky; Chiang Mai, Thailand; |
Year of publication: | 2011 |
Handle: | http://hdl.handle.net/11572/92167 |
Appears in types: | 04.1 Paper in proceedings |