In complex information spaces, it is often hard for users to formally specify the elements of interest (i.e., the desired answers), either due to lack of technical knowledge, or the complexity of the task, or even because they may not know exactly what they are looking for. Exemplar queries constitute a query paradigm that makes this process easier, by allowing users to provide examples of the elements of interest. However, existing approaches either assume that the complete structure of the intended results is known to the user and is unique, or the provided examples are elements that should be present in the result set, therefore restricting the range of queries and answers that are supported. In this paper, we propose a more general approach, the multi-exemplar query paradigm, where each example may describe only some of the desired characteristics of the elements of interest. Since each example describes only a part of the desired specification, multiple examples can be used, in order to collectively constitute a more precise specification. We provide a formal definition of the problem, and describe exact algorithms for its solution. In addition, we study the top-k problem for a family of generic ranking functions that capture different scenarios, and present an exact solution for this problem as well. A thorough experimental assessment on a large real dataset demonstrates the effectiveness and efficiency of the proposed approach.
Multi-Exemplar Queries: Searching in Complex Information Spaces Using Simple Examples
Lissandrini, Matteo;Velegrakis, Ioannis;
2016-01-01
Abstract
In complex information spaces, it is often hard for users to formally specify the elements of interest (i.e., the desired answers), either due to lack of technical knowledge, or the complexity of the task, or even because they may not know exactly what they are looking for. Exemplar queries constitute a query paradigm that makes this process easier, by allowing users to provide examples of the elements of interest. However, existing approaches either assume that the complete structure of the intended results is known to the user and is unique, or the provided examples are elements that should be present in the result set, therefore restricting the range of queries and answers that are supported. In this paper, we propose a more general approach, the multi-exemplar query paradigm, where each example may describe only some of the desired characteristics of the elements of interest. Since each example describes only a part of the desired specification, multiple examples can be used, in order to collectively constitute a more precise specification. We provide a formal definition of the problem, and describe exact algorithms for its solution. In addition, we study the top-k problem for a family of generic ranking functions that capture different scenarios, and present an exact solution for this problem as well. A thorough experimental assessment on a large real dataset demonstrates the effectiveness and efficiency of the proposed approach.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione