Knowledge Based Open Entity Matching / Bortoli, Stefano. - (2013), pp. 1-275.

Knowledge Based Open Entity Matching

Bortoli, Stefano
2013-01-01

Abstract

In this work we argue for the definition of a knowledge-based entity matching framework for the implementation of a reliable and incrementally scalable solution. Such a knowledge base is formed by an ontology and a set of entity matching rules suitable to be applied as a reliable equational theory in the context of the Semantic Web. In particular, we aim to show that, by relying on a set of contextual mappings to ease the semantic heterogeneity that characterizes descriptions on the Web, a knowledge-based solution can perform comparably to, and sometimes better than, existing state-of-the-art solutions. We further argue that a knowledge-based solution to the open entity matching problem ought to be considered under the open world assumption, as in some cases the descriptions to be matched may not contain the information necessary to make an accurate matching decision. The main goal of this work is to show that the proposed framework is suitable for pursuing a reliable solution to the entity matching problem, regardless of the set of rules for the ontology adopted. In fact, we believe that the structural and syntactic heterogeneity affecting data on the Web undermines the definition of a single global solution. However, we argue that a knowledge-driven approach, which considers the semantics and meta-properties of the compared attributes, can provide important benefits and lead to more reliable solutions. To achieve this goal, we run several experiments to evaluate different sets of rules, testing our thesis and learning important lessons for future developments.
The sets of rules that we consider to bootstrap the solution proposed in this work are the result of diverse complementary processes. First, we investigate whether using machine learning techniques to capture the matching knowledge that people employ when making entity matching decisions can produce an effective set of rules (bottom-up strategy). Second, we investigate the application of formal ontology tools to analyze the features defined in the ontology and support the definition of entity matching rules (top-down strategy). Moreover, we argue that by merging the rules resulting from these complementary processes, we can define a set of rules that reliably supports entity matching decisions in an open context.
2013
XXV
2012-2013
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Bouquet, Paolo
no
English
Sector INF/01 - Computer Science
Files in this item:
Knowledge-Based_Open_Entity_Matching.pdf

open access

Type: Doctoral Thesis
License: All rights reserved
Size: 3.64 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/368858