The goal of information retrieval (IR) is to map a natural language query, which specifies the user information needs, to a set of objects in a given collection, which meet these needs. Historically, there have been two major approaches to IR that we call syntactic IR and semantic IR. In syntactic IR, search engines use words or multi-word phrases that occur in document and query representations. The search procedure, used by these search engines, is principally based on the syntactic matching of document and query representations. The precision and recall achieved by these search engines might be negatively affected by the problems of (i) polysemy, (ii) synonymy, (iii) complex concepts, and (iv) related concepts. Semantic IR is based on fetching document and query representations through a semantic analysis of their contents using natural language processing techniques and then retrieving documents by matching these semantic representations. Semantic IR approaches are developed to improve the quality of syntactic approaches but, in practice, results of semantic IR are often inferior to that of syntactic one. In this thesis, we propose a novel approach to IR which extends syntactic IR with semantics, thus addressing the problem of low precision and low recall of syntactic IR. The main idea is to keep the same machinery which has made syntactic IR so successful, but to modify it so that, whenever possible (and useful), syntactic IR is substituted by semantic IR, thus improving the system performance. As instances of the general approach, we describe the semantics enabled approaches to: (i) document retrieval, (ii) document classification, and (iii) peer-to-peer search.

Concept Search: Semantics Enabled Information Retrieval / Kharkevich, Uladzimir. - (2010), pp. 1-138.

Concept Search: Semantics Enabled Information Retrieval

Kharkevich, Uladzimir
2010-01-01

Abstract

The goal of information retrieval (IR) is to map a natural language query, which specifies the user information needs, to a set of objects in a given collection, which meet these needs. Historically, there have been two major approaches to IR that we call syntactic IR and semantic IR. In syntactic IR, search engines use words or multi-word phrases that occur in document and query representations. The search procedure, used by these search engines, is principally based on the syntactic matching of document and query representations. The precision and recall achieved by these search engines might be negatively affected by the problems of (i) polysemy, (ii) synonymy, (iii) complex concepts, and (iv) related concepts. Semantic IR is based on fetching document and query representations through a semantic analysis of their contents using natural language processing techniques and then retrieving documents by matching these semantic representations. Semantic IR approaches are developed to improve the quality of syntactic approaches but, in practice, results of semantic IR are often inferior to that of syntactic one. In this thesis, we propose a novel approach to IR which extends syntactic IR with semantics, thus addressing the problem of low precision and low recall of syntactic IR. The main idea is to keep the same machinery which has made syntactic IR so successful, but to modify it so that, whenever possible (and useful), syntactic IR is substituted by semantic IR, thus improving the system performance. As instances of the general approach, we describe the semantics enabled approaches to: (i) document retrieval, (ii) document classification, and (iii) peer-to-peer search.
2010
XXI
2009-2010
Informatica e Studi Aziendali (cess.4/11/12)
Information and Communication Technology
Giunchiglia, Fausto
no
Inglese
Settore INF/01 - Informatica
File in questo prodotto:
File Dimensione Formato  
PhD-Thesis.pdf

accesso aperto

Tipologia: Tesi di dottorato (Doctoral Thesis)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 946.49 kB
Formato Adobe PDF
946.49 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/368671
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact