
Efficient Convolution Kernels for Dependency and Constituent Syntactic Trees

Moschitti, Alessandro
2006-01-01

Abstract

The availability of large-scale data sets of manually annotated predicate argument structures has recently favored the use of Machine Learning approaches to the design of automated Semantic Role Labeling (SRL) systems. The main research in this area relates to the design choices for feature representation and for effective decompositions of the task into different learning models. Regarding the former choice, structural properties of full syntactic parses are largely employed, as they encode different principles suggested by the linking theory between syntax and semantics. The latter choice relates to several learning schemes over global views of the parses. For example, re-ranking stages operating over alternative predicate-argument sequences of the same sentence have been shown to be very effective. In this paper, we propose several kernel functions to model parse tree properties in kernel-based machines, e.g., Perceptrons or Support Vector Machines. In particular, we define different kinds of tree kernels as general approaches to feature engineering in SRL. Moreover, we extensively experiment with such kernels to investigate their contribution to individual stages of an SRL architecture, both in isolation and in combination with other traditional manually coded features. The results on the boundary recognition, classification, and re-ranking stages provide systematic evidence of the significant impact of tree kernels on overall accuracy, especially when the amount of training data is small. As a conclusive result, tree kernels provide a general and easily portable feature engineering method which is applicable to a large family of Natural Language Processing tasks.
2006
Machine Learning: ECML 2006, 17th European Conference on Machine Learning
Berlin
Springer
9783540453758
Files in this record:
No files are associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11572/12477
Citations
  • PMC: not available
  • Scopus: 260
  • Web of Science: 149