Exploiting language models for visual recognition


Le, Dieu Thu; Uijlings, Jasper Reinout Robertus; Bernardi, Raffaella

2013-01-01

Abstract

The problem of learning language models from large text corpora has been widely studied within the computational linguistics community. However, little is known about how these language models perform when applied to the computer vision domain. In this work, we compare representative models (a window-based model, a topic model, a distributional memory, and the commonsense knowledge database ConceptNet) in two visual recognition scenarios: human action recognition and object prediction. We examine whether the knowledge extracted from text through these models is compatible with the knowledge represented in images, and we determine the usefulness of the different language models in aiding the two visual recognition tasks. The study shows that language models built from general text corpora can be used in place of expensive annotated images and can even outperform the image model when tested on a large, general dataset.


	Date of publication
	
			2013
		
	Proceedings title
	
			Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
		
	Place of publication
	
			Seattle, Washington, USA
		
	Publisher
	
			Association for Computational Linguistics (ACL)
		
	ISBN
	
			9781937284978
		
	Scopus identifier
	
			2-s2.0-84926309256
		
	All authors
	
			Le, Dieu Thu; Uijlings, Jasper Reinout Robertus; Bernardi, Raffaella
		
	Publication type
	
			04.1 Paper in Proceedings

Files in this item:

File	Description	Type	License	Access	Size	Format
D13-1072.pdf	Main article	Publisher's layout	All rights reserved	Archive managers only	578.17 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/98429
