Fisher kernel temporal variation-based relevance feedback for video retrieval / Mironică, Ionuţ; Ionescu, Bogdan; Uijlings, Jasper Reinout Robertus; Sebe, Niculae. - In: COMPUTER VISION AND IMAGE UNDERSTANDING. - ISSN 1077-3142. - 143:(2016), pp. 38-51. [10.1016/j.cviu.2015.10.005]

Fisher kernel temporal variation-based relevance feedback for video retrieval

Uijlings, Jasper Reinout Robertus; Sebe, Niculae
2016-01-01

Abstract

This paper proposes a novel framework for relevance feedback based on the Fisher Kernel (FK). Specifically, we train a Gaussian Mixture Model (GMM) on the top retrieval results, without supervision, and use it to create an FK representation that is therefore specialized in modelling the most relevant examples. We use the FK representation to explicitly capture temporal variation in video via frame-based features taken at different time intervals. While the GMM is being trained, the user selects from the top examples those that match what they are looking for. This feedback is used to train a Support Vector Machine (SVM) on the FK representation, which is then applied to re-rank the top retrieved results. We show that our approach outperforms other state-of-the-art relevance feedback methods. Experiments were carried out on the Blip10000, UCF50, UCF101 and ADL standard datasets using a broad range of multi-modal content descriptors (visual, audio, and text).
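The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the widely used Fisher vector formulation (gradients with respect to the GMM means and variances, followed by power and L2 normalisation), which the abstract does not specify, and it substitutes a small hinge-loss linear SVM trained by subgradient descent for a library SVM. All function names (`posteriors`, `fit_diag_gmm`, `fisher_vector`, `train_linear_svm`, `rerank`), the parameter values, the synthetic frame features and the simulated user feedback are hypothetical.

```python
import numpy as np

def posteriors(X, weights, means, var):
    """Soft-assign each row of X to each Gaussian; returns shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]                    # (n, k, d)
    log_p = -0.5 * (diff**2 / var + np.log(2 * np.pi * var)).sum(axis=2)
    log_p = log_p + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)                   # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def fit_diag_gmm(X, k, n_iter=25, seed=0):
    """Unsupervised EM for a diagonal-covariance GMM on pooled frame features."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)].copy()
    var = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = posteriors(X, weights, means, var)               # E-step
        nk = resp.sum(axis=0) + 1e-10                           # M-step
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        var = np.maximum((resp.T @ X**2) / nk[:, None] - means**2, 1e-6)
    return weights, means, var

def fisher_vector(frames, weights, means, var):
    """Encode one video's frame features (T, d) as a vector of length 2*k*d."""
    T = frames.shape[0]
    gamma = posteriors(frames, weights, means, var)             # (T, k)
    z = (frames[:, None, :] - means) / np.sqrt(var)             # (T, k, d)
    g_mu = (gamma[:, :, None] * z).sum(axis=0) / (T * np.sqrt(weights)[:, None])
    g_var = (gamma[:, :, None] * (z**2 - 1)).sum(axis=0) / (T * np.sqrt(2 * weights)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                      # power normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)                    # L2 normalisation

def train_linear_svm(F, y, lam=0.01, n_epochs=200, lr=0.1):
    """Hinge-loss linear SVM by subgradient descent; labels y are in {-1, +1}."""
    w, b = np.zeros(F.shape[1]), 0.0
    for _ in range(n_epochs):
        active = y * (F @ w + b) < 1                            # margin violators
        grad_w = lam * w - (y[active][:, None] * F[active]).sum(axis=0) / len(y)
        grad_b = -y[active].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def rerank(videos, feedback, k=4):
    """videos: list of (T_i, d) arrays for the top results; feedback: {index: +1 or -1}."""
    weights, means, var = fit_diag_gmm(np.vstack(videos), k)    # GMM on top results only
    F = np.array([fisher_vector(v, weights, means, var) for v in videos])
    idx = np.array(sorted(feedback))
    y = np.array([feedback[i] for i in idx], dtype=float)
    w, b = train_linear_svm(F[idx], y)                          # learn from user feedback
    scores = F @ w + b
    return np.argsort(-scores), scores                          # best-scoring video first

rng = np.random.default_rng(0)
# 12 synthetic "videos": 40 frames of 6-D features; even indices play the relevant class
videos = [rng.normal(loc=(0.5 if i % 2 == 0 else -0.5), size=(40, 6)) for i in range(12)]
feedback = {0: 1, 1: -1, 2: 1, 3: -1}                           # simulated user selections
order, scores = rerank(videos, feedback, k=3)
print(order)
```

In this sketch the GMM is fitted before the feedback is read, whereas the abstract describes the two steps as happening concurrently. A practical system would likely replace the hand-written EM and SVM routines with library implementations.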
Mironică, Ionuţ; Ionescu, Bogdan; Uijlings, Jasper Reinout Robertus; Sebe, Niculae
Use this identifier to cite or link to this document: https://hdl.handle.net/11572/148083