Exploratory search is the new frontier of information consumption as it goes well beyond simple \emph{lookups}. Information repositories are ubiquitous and grow larger every day, and automated search systems help users find information in such collections. To extract knowledge from these repositories, the common ``query lookup'' retrieval paradigm accepts a set of specifications (the query) that describes the objects of interest and then collects such objects. Yet, the query lookup retrieval paradigms commonly in use are no more sufficient to support complex information needs, as they can only provide candidate starting points, but do not help the user in expanding their knowledge. To ease access and consumption of rich information repositories, we address the crucial problem of data exploration. Exploratory tasks match the natural need for finding answers to open-ended information needs within an unfamiliar environment. In particular, in this dissertation, we focus on enabling access to and exploration of rich information graphs. Within businesses, organizations, and among researchers, data is produced in many forms, large volumes, and different contexts. As a consequence of this heterogeneity, many applications find more useful modelling their datasets with the graph model, where information is represented with entities (nodes) and relationships (edges). Those are the data graphs, the graph databases, the knowledge graphs, or more generally information graphs. The richness of their schema and of their content makes it challenging for users to express appropriate queries and retrieve the desired results. Hence, to allow an effective exploration of a graph, we require: (i) an expressive \emph{query paradigm}, (ii) an intuitive \emph{query mechanism}, and (iii) an appropriate \emph{storage and query processing system}. In this work, we address these three requirements. An exploratory query should be simple enough to avoid complicate declarative languages (such as SQL or SPARQL), and at the same time, it should retain the flexibility and expressiveness of such languages. For this reason, with respect to the query paradigm, we introduce the notion of \emph{exemplar queries} and propose extensions to handle multiple incomplete examples. An exemplar query is a query method in which the user, or the analyst, circumvents query languages by using examples as input. In particular, the solution we design allows flexible matching in the case of incomplete or partially specified examples. Moreover, to enable this query paradigm, there is the need for interactive systems that implement an incremental query-constructions mechanism and interactive explorations. To address this need, we study algorithms and implementations based on pseudo-relevance feedback for \emph{exemplar query suggestion}, along with an in-depth study of their effectiveness. Finally, as there exist many graph databases, high heterogeneity can be observed in the functionalities and performances of these systems. We provide an exhaustive evaluation methodology and a comprehensive study of the existing systems that allow to understand their capabilities and limitations. In particular, we design a novel micro-benchmarking framework for the assessment of the functionalities of some graph databases among the most prominent in the area and provide detailed insights on their performance.

Enabling access to and exploration of information graphs / Lissandrini, Matteo. - (2018), pp. 1-170.

Enabling access to and exploration of information graphs

Lissandrini, Matteo
2018-01-01

Abstract

Exploratory search is the new frontier of information consumption as it goes well beyond simple \emph{lookups}. Information repositories are ubiquitous and grow larger every day, and automated search systems help users find information in such collections. To extract knowledge from these repositories, the common ``query lookup'' retrieval paradigm accepts a set of specifications (the query) that describes the objects of interest and then collects such objects. Yet, the query lookup retrieval paradigms commonly in use are no more sufficient to support complex information needs, as they can only provide candidate starting points, but do not help the user in expanding their knowledge. To ease access and consumption of rich information repositories, we address the crucial problem of data exploration. Exploratory tasks match the natural need for finding answers to open-ended information needs within an unfamiliar environment. In particular, in this dissertation, we focus on enabling access to and exploration of rich information graphs. Within businesses, organizations, and among researchers, data is produced in many forms, large volumes, and different contexts. As a consequence of this heterogeneity, many applications find more useful modelling their datasets with the graph model, where information is represented with entities (nodes) and relationships (edges). Those are the data graphs, the graph databases, the knowledge graphs, or more generally information graphs. The richness of their schema and of their content makes it challenging for users to express appropriate queries and retrieve the desired results. Hence, to allow an effective exploration of a graph, we require: (i) an expressive \emph{query paradigm}, (ii) an intuitive \emph{query mechanism}, and (iii) an appropriate \emph{storage and query processing system}. In this work, we address these three requirements. An exploratory query should be simple enough to avoid complicate declarative languages (such as SQL or SPARQL), and at the same time, it should retain the flexibility and expressiveness of such languages. For this reason, with respect to the query paradigm, we introduce the notion of \emph{exemplar queries} and propose extensions to handle multiple incomplete examples. An exemplar query is a query method in which the user, or the analyst, circumvents query languages by using examples as input. In particular, the solution we design allows flexible matching in the case of incomplete or partially specified examples. Moreover, to enable this query paradigm, there is the need for interactive systems that implement an incremental query-constructions mechanism and interactive explorations. To address this need, we study algorithms and implementations based on pseudo-relevance feedback for \emph{exemplar query suggestion}, along with an in-depth study of their effectiveness. Finally, as there exist many graph databases, high heterogeneity can be observed in the functionalities and performances of these systems. We provide an exhaustive evaluation methodology and a comprehensive study of the existing systems that allow to understand their capabilities and limitations. In particular, we design a novel micro-benchmarking framework for the assessment of the functionalities of some graph databases among the most prominent in the area and provide detailed insights on their performance.
2018
XXVIII
2018-2019
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Velegrakis, Yannis
no
Inglese
Settore INF/01 - Informatica
File in questo prodotto:
File Dimensione Formato  
matteo.lissandrini-thesis.pdf

Solo gestori archivio

Tipologia: Tesi di dottorato (Doctoral Thesis)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 8.68 MB
Formato Adobe PDF
8.68 MB Adobe PDF   Visualizza/Apri
DECLARATORIA_ENG.pdf

Solo gestori archivio

Tipologia: Tesi di dottorato (Doctoral Thesis)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 274.36 kB
Formato Adobe PDF
274.36 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/368600
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact