A rule-based entity matching task requires the definition of an effective set of rules, which is a time-consuming and error-prone process. The typical approach adopted for its resolution is a trial and error method, where the rules are incrementally added and modified until satisfactory results are obtained. This approach requires significant human intervention, since a typical dataset needs the definition of a large number of rules and possible interconnections that cannot be manually managed. In this paper, we propose TuneR, a software library supporting developers (i.e., coders, scientists, and domain experts) in tuning sets of matching rules. It aims to reduce human intervention by offering a tool for the optimization of rule sets based on user-defined criteria (such as effectiveness, interpretability, etc.). Our goal is to integrate the framework in the Magellan ecosystem, thus completing the functionalities required by the developers for performing Entity Matching tasks.
TuneR: Fine Tuning of Rule-based Entity Matchers / Paganelli, Matteo; Sottovia, Paolo; Guerra, Francesco; Velegrakis, Yannis. - (2019), pp. 2945-2948. (Intervento presentato al convegno CIKM tenutosi a Beijing, China nel November 3-7,2019) [10.1145/3357384.3357854].
TuneR: Fine Tuning of Rule-based Entity Matchers
Sottovia , Paolo;Velegrakis, Yannis
2019-01-01
Abstract
A rule-based entity matching task requires the definition of an effective set of rules, which is a time-consuming and error-prone process. The typical approach adopted for its resolution is a trial and error method, where the rules are incrementally added and modified until satisfactory results are obtained. This approach requires significant human intervention, since a typical dataset needs the definition of a large number of rules and possible interconnections that cannot be manually managed. In this paper, we propose TuneR, a software library supporting developers (i.e., coders, scientists, and domain experts) in tuning sets of matching rules. It aims to reduce human intervention by offering a tool for the optimization of rule sets based on user-defined criteria (such as effectiveness, interpretability, etc.). Our goal is to integrate the framework in the Magellan ecosystem, thus completing the functionalities required by the developers for performing Entity Matching tasks.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione