On the Use of an Intermediate Class in Boolean Crowdsourced Relevance Annotations for Learning to Rank Comments / Barrón-Cedeño, Alberto; Da San Martino, Giovanni; Filice, Simone; Moschitti, Alessandro. - Electronic. - (2017), pp. 1209-1212. (Paper presented at the SIGIR '17 conference, held in Shinjuku, Tokyo, Japan, 7-11 August 2017) [10.1145/3077136.3080763].

On the Use of an Intermediate Class in Boolean Crowdsourced Relevance Annotations for Learning to Rank Comments

Alessandro Moschitti
2017-01-01

Abstract

In many Information Retrieval tasks, the boundary between classes is not well defined, and assigning a document to a specific class may be complicated, even for humans. For instance, a document that is not directly related to the user's query may still contain relevant information. In this scenario, one option is to define an intermediate class that collects ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes. We re-annotated a subset of instances in a binary good vs. bad setting. Our main contribution is to show empirically that including an intermediate class to assess Boolean relevance is not useful. Moreover, if the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.
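
As an illustration of the last point, the following is a minimal sketch of dropping intermediate-class instances before training a Boolean relevance model. The label names (`good`, `intermediate`, `bad`), the function `drop_intermediate`, and the toy comments are hypothetical and do not come from the paper or its datasets.

```python
from typing import List, Tuple

# Hypothetical label names: the abstract only states that three classes exist.
GOOD, INTERMEDIATE, BAD = "good", "intermediate", "bad"


def drop_intermediate(examples: List[Tuple[str, str]]) -> List[Tuple[str, bool]]:
    """Discard intermediate-class instances and map the rest to Boolean relevance."""
    return [
        (text, label == GOOD)      # True for good, False for bad
        for text, label in examples
        if label != INTERMEDIATE   # ambiguous instances are removed, not remapped
    ]


# Toy 3-class training data (invented for illustration only).
train = [
    ("Try the visa office downtown.", GOOD),
    ("Not sure, but maybe ask the embassy?", INTERMEDIATE),
    ("Anyone up for football on Friday?", BAD),
]

binary_train = drop_intermediate(train)
print(binary_train)
```

The sketch removes the ambiguous instances rather than merging them into either remaining class, which is the option the abstract describes as safe at training time.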
2017
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Shinjuku, Tokyo, Japan, August 7-11, 2017
Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice and Alessandro Moschitti
New York, NY United States
ACM
978-1-4503-5022-8
Barrón-Cedeño, Alberto; Da San Martino, Giovanni; Filice, Simone; Moschitti, Alessandro
Files in this item:

2017_SIGIR_Annotations.pdf
Access: open access
Type: Refereed author's manuscript (post-print)
License: All rights reserved
Size: 511.24 kB
Format: Adobe PDF

3077136.3080763.pdf
Access: archive managers only
Type: Publisher's version (publisher's layout)
License: All rights reserved
Size: 907.96 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/195424