Dealing with Semantic Heterogeneity in Classifications

Maltese, Vincenzo

doi:10.15168/11572_369248

Many projects have dealt with mappings between classifications in both the computer science and digital library communities. The adopted solutions range from fully manual to fully automatic approaches. Manual approaches are very precise, but automation becomes unavoidable when classifications contain thousands of nodes with millions of candidate correspondences. As a fundamental preliminary step towards automation, S-Match converts classifications into formal ontologies, i.e. lightweight ontologies. Although many solutions to the problem have been offered, with S-Match representing a state-of-the-art matcher with good accuracy and run-time performance, there are still several open problems. In particular, the problems addressed in this thesis include: (a) Run-time performance. Due to the high number of calls to the SAT reasoning engine, semantic matching may require exponential time; (b) Maintenance. Current matching tools offer poor support to users for the creation, validation and maintenance of the correspondences; (c) Lack of background knowledge. The lack of domain-specific background knowledge is one important cause of low recall. As significant progress on (a) and (b), we describe MinSMatch, a semantic matching tool that we developed by evolving S-Match and that computes the minimal mapping between two lightweight ontologies. The minimal mapping is the minimal subset of correspondences such that all the others can be efficiently computed from them and are therefore said to be redundant. We provide a formal definition of minimal and, dually, redundant mappings, evidence that the minimal mapping always exists and is unique, and a correct and complete algorithm for computing it. Our experiments demonstrate a substantial improvement in run-time. Based on this, we also developed a method to support users in the validation task that allows saving up to 99% of the time.
We address problem (c) by creating and using an extensible diversity-aware knowledge base that provides a continuously growing quantity of properly organized knowledge. Our approach is centered on the fundamental notions of domain and context. Domains, developed by adapting the faceted approach from library science, are the main means by which diversity is captured, and they allow scaling because they make it possible to add new knowledge as needed. Context allows better disambiguation of the terms used and reduces the complexity of reasoning at run-time. As proof of the applicability of the approach, we developed the Space domain and applied it in the Semantic Geo-Catalogue (SGC) project.
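
The notion of a redundant correspondence described in the abstract can be illustrated with a small sketch. This is not the MinSMatch algorithm and is not taken from the thesis: it assumes that each classification is a tree in which a child node is less general than its parent, it handles only the "less general than" relation, and all function names and sample data are hypothetical.

```python
# Illustrative sketch only: NOT the MinSMatch algorithm.
# Assumptions (not from the source): each classification is a tree in which
# a child is less general than its parent, and every correspondence (a, b)
# means "source node a is less general than target node b".

def ancestors_or_self(node, parent):
    """Return the set containing node and all of its ancestors."""
    result = set()
    while node is not None:
        result.add(node)
        node = parent.get(node)
    return result

def is_redundant(corr, other, parent_src, parent_tgt):
    """True if corr = (a, b) follows from other = (c, d) by tree structure alone."""
    a, b = corr
    c, d = other
    return (corr != other
            and c in ancestors_or_self(a, parent_src)   # a is at or below c
            and b in ancestors_or_self(d, parent_tgt))  # b is at or above d

def minimal_subset(mapping, parent_src, parent_tgt):
    """Keep only the correspondences not implied by another one in the mapping."""
    return [m for m in mapping
            if not any(is_redundant(m, o, parent_src, parent_tgt) for o in mapping)]

# Hypothetical toy classifications, given as child -> parent maps.
parent_src = {"Europe": None, "Italy": "Europe", "Trento": "Italy"}
parent_tgt = {"Places": None, "Italian cities": "Places"}

mapping = [("Italy", "Places"), ("Trento", "Places"), ("Trento", "Italian cities")]
minimal = minimal_subset(mapping, parent_src, parent_tgt)
print(minimal)
```

In this toy example, ("Trento", "Places") is dropped because it follows from ("Italy", "Places"): Trento lies below Italy in the source tree, so only the remaining two correspondences would need to be validated by a user.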

Dealing with Semantic Heterogeneity in Classifications / Maltese, Vincenzo. - (2012), pp. 1-190. [10.15168/11572_369248]

2012-01-01

Defended on: 2012
Cycle: XXIII
Academic year: 2011-2012
Department: Ingegneria e Scienza dell'Informaz (cess.4/11/12)
Doctoral programme: Informatica e telecomunicazioni (until academic year 2020-21, 36th cycle)
Unitn internal supervisor: Giunchiglia, Fausto
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_369248
Language: English
Reference SSD (valid until 24/06/2024): Sector INF/01 - Computer Science; Sector MAT/01 - Mathematical Logic
Appears in types: 08.1 Doctoral Thesis

Files in this item:

File	Size	Format
Maltese_Vincenzo_PHD_THESIS.pdf (open access; type: Doctoral Thesis; license: All rights reserved)	2.47 MB	Adobe PDF