Metadata are fundamental for the indexing, browsing, and retrieval of cultural heritage resources in digital repositories. Since manual control of metadata quality may not be feasible, especially for large collections, this Ph.D. thesis focuses specifically on the problem of automatic metadata quality assessment. Taking as the main reference the Metadata Quality Framework developed by Thomas Bruce and Diane Hillmann, we propose to evaluate metadata according to three aspects. The first is metadata Completeness, approached as a statistical analysis: we compute the ratio of filled elements with respect to the metadata schema, taking into account its structure as well as the specific topic of a collection. The second is metadata Accuracy of the textual description of a given cultural heritage object, approached as a binary classification problem: we determine whether the field contains a high-quality or low-quality description, measured as the compliance of the textual content with the description rules in the guidelines used to implement the metadata. The last aspect is metadata Coherence, where we investigate the feasibility of using high-quality metadata at source while implementing metadata information. We assess the Coherence of the subject element by recommending the three most likely subjects of a resource, based on an analysis of its iconography. Applying this methodology to the Italian digital library "Cultura Italia", we found that, overall, it is indeed possible to evaluate metadata quality automatically. However, despite the promising results, our methods should also be tested on a wider range of digital repositories to obtain a more detailed picture of automatic metadata quality evaluation.
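As a rough illustration of the Completeness measure described above, the ratio of filled elements to schema elements could be sketched as follows. This is a minimal sketch and not the thesis's implementation: the function name `completeness`, the per-element weights (standing in for the schema structure and collection topic), and the sample record are all assumptions made up for this example.

```python
from typing import Mapping, Optional


def completeness(
    record: Mapping[str, Optional[str]],
    schema_weights: Mapping[str, float],
) -> float:
    """Weighted share of schema elements that carry a non-empty value.

    `schema_weights` is a hypothetical mapping from element name to
    importance; equal weights reduce this to a plain filled/total ratio.
    """
    total = sum(schema_weights.values())
    if total <= 0:
        raise ValueError("schema_weights must sum to a positive value")
    filled = sum(
        weight
        for element, weight in schema_weights.items()
        # Missing keys, None, and whitespace-only strings count as unfilled.
        if (record.get(element) or "").strip()
    )
    return filled / total


# Hypothetical four-element schema with equal weights (illustrative only).
weights = {"title": 1.0, "creator": 1.0, "subject": 1.0, "description": 1.0}

# Made-up record: two of the four elements are filled.
record = {"title": "Madonna col Bambino", "creator": "", "subject": "religious art"}

score = completeness(record, weights)
print(score)
```

Raising the weight of elements that matter more for a given collection would make missing values in those elements lower the score more strongly; how such weights should be chosen is not specified in the abstract.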

Metadata Quality Evaluation in Cultural Heritage Domain / Lorenzini, Matteo. - (2022 Feb 15), pp. 1-129. [10.15168/11572_330448]

Metadata Quality Evaluation in Cultural Heritage Domain

Lorenzini, Matteo
2022-02-15

15-feb-2022
XXXIII
2019-2020
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Tonelli, Sara
Rospocher, Marco
no
English
Files in this item:
File Size Format
tesi_iris.pdf

open access

Type: Doctoral Thesis
License: Creative Commons
Size: 6.35 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/330448