Modern information systems often store data that has been transformed and integrated from a variety of sources. This integration may obscure the original source semantics of data items. For many tasks, it is important to be able to determine not only where data items originated, but also why they appear in the integration as they do and through what transformation they were derived. This problem is known as data provenance. In this work, we consider data provenance at the schema and mapping level. In particular, we consider how to answer questions such as "what schema elements in the source(s) contributed to this value", or "through what transformations or mappings was this value derived?" Towards this end, we elevate schemas and mappings to first-class citizens that are stored in a repository and are associated with the actual data values. An extended query language, called MXQL, is also developed that allows meta-data to be queried as regular data and we describe its implementation. ...

Representing and querying data transformations

Velegrakis, Ioannis;Mylopoulos, Ioannis
2005-01-01

Abstract

Modern information systems often store data that has been transformed and integrated from a variety of sources. This integration may obscure the original source semantics of data items. For many tasks, it is important to be able to determine not only where data items originated, but also why they appear in the integration as they do and through what transformation they were derived. This problem is known as data provenance. In this work, we consider data provenance at the schema and mapping level. In particular, we consider how to answer questions such as "what schema elements in the source(s) contributed to this value", or "through what transformations or mappings was this value derived?" Towards this end, we elevate schemas and mappings to first-class citizens that are stored in a repository and are associated with the actual data values. An extended query language, called MXQL, is also developed that allows meta-data to be queried as regular data and we describe its implementation. ...
2005
Proceedings of the 21st International Conference on Data Engineering, ICDE 2005
10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA
IEEE COMPUTER SOC
9780769522852
Velegrakis, Ioannis; R., Miller; Mylopoulos, Ioannis
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/78269
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 35
  • ???jsp.display-item.citation.isi??? 10
  • OpenAlex ND
social impact