A 2-phase frame-based knowledge extraction framework

Corcoglioniti, Francesco;Rospocher, Marco;Palmero Aprosio, Alessio

2016-01-01

Abstract

We present an approach for extracting knowledge from English natural language texts in which processing is decoupled into two phases. The first phase comprises several standard NLP tasks whose results are integrated in a single RDF graph of mentions. The second phase processes the mention graph with SPARQL-like mapping rules to produce a knowledge graph organized around semantic frames (i.e., prototypical descriptions of events and situations). The decoupling allows: (i) choosing different tools for the NLP tasks without affecting the remaining computation; (ii) combining the outputs of different NLP tasks in non-trivial ways, leveraging their integrated and coherent representation in a mention graph; and (iii) relating each piece of extracted knowledge to the mention(s) it comes from, leveraging the single RDF representation. We evaluate the precision and recall of our approach on a gold standard, showing its competitiveness with respect to the state of the art. We also evaluate execution times and (sampled) accuracy on a corpus of 110K Wikipedia pages, showing the applicability of the approach to large corpora.
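The two-phase decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain tuples stand in for RDF triples, an ordinary function stands in for a SPARQL-like mapping rule, and every name used here (`build_mention_graph`, `apply_frame_rule`, the `m:`, `inst:` and `ex:` prefixes, the `frame`, `role:`, `refersTo` and `denotedBy` properties, and the sample annotations) is a hypothetical choice made for illustration only.

```python
# Hypothetical sketch of a 2-phase extraction pipeline (not the paper's code).
# Triples are (subject, property, object) tuples instead of real RDF.

def build_mention_graph(tool_outputs):
    """Phase 1: merge the outputs of independent NLP tools into one mention graph."""
    graph = set()
    for output in tool_outputs:
        graph.update(output)
    return graph


def apply_frame_rule(mention_graph):
    """Phase 2: one mapping rule that turns frame mentions into frame instances.

    It combines frame/role annotations with entity links produced by a
    different tool, and records which mention each fact comes from.
    """
    # Index outgoing edges by subject so that the rule can follow role links.
    index = {}
    for s, p, o in mention_graph:
        index.setdefault(s, []).append((p, o))

    knowledge = set()
    for mention, edges in index.items():
        for prop, frame in edges:
            if prop != "frame":
                continue
            instance = "inst:" + mention.split(":", 1)[1]
            knowledge.add((instance, "type", frame))
            knowledge.add((instance, "denotedBy", mention))  # provenance link
            for role_prop, filler in edges:
                if not role_prop.startswith("role:"):
                    continue
                role = role_prop[len("role:"):]
                # Resolve the filler mention through the entity-linking output.
                for p2, entity in index.get(filler, []):
                    if p2 == "refersTo":
                        knowledge.add((instance, role, entity))
                        knowledge.add((entity, "denotedBy", filler))
    return knowledge


# Assumed annotations for the sentence "Alice bought a book".
srl_output = [
    ("m:bought", "frame", "Commerce_buy"),
    ("m:bought", "role:Buyer", "m:Alice"),
]
linking_output = [("m:Alice", "refersTo", "ex:Alice")]

mention_graph = build_mention_graph([srl_output, linking_output])
knowledge_graph = apply_frame_rule(mention_graph)

for triple in sorted(knowledge_graph):
    print(triple)
```

Because the rule only reads the mention graph, replacing the tool behind `srl_output` or `linking_output` would leave phase 2 unchanged as long as the same properties are emitted, which is the kind of independence the abstract attributes to the decoupling.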

Date of publication: 2016
Proceedings title: SAC '16 Proceedings of the 31st Annual ACM Symposium on Applied Computing
Place of publication: New York
Publisher: ACM
ISBN: 9781450337397
Scopus identifier: 2-s2.0-84975887050
All authors: Corcoglioniti, Francesco; Rospocher, Marco; Palmero Aprosio, Alessio
Citation: A 2-phase frame-based knowledge extraction framework / Corcoglioniti, F., Rospocher, M., Palmero Aprosio, A. (2016), pp. 354-361. (31st Annual ACM Symposium on Applied Computing, SAC 2016, April 4-8, 2016) [10.1145/2851613.2851845].

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/454150
