Multimodal Distributional Semantics / Bruni, Elia. - (2013), pp. 1-142.

Multimodal Distributional Semantics

Bruni, Elia
2013-01-01

Abstract

Although it is a very simple statement, the distributional hypothesis (namely, that words occurring in similar contexts are semantically similar) serves as the main assumption of many computational linguistic techniques. This is mostly because it makes it possible to construct a representation of word meaning easily and automatically from a large textual input. Among the corpus-based computational linguistic techniques that adopt the distributional hypothesis, distributional semantic models (DSMs) have been shown to be very effective in many semantics-related tasks. DSMs approximate word meaning with vectors that keep track of the patterns of co-occurrence of words in the processed corpora. In addition, DSMs have been shown to be a plausible computational model of human concept cognition, since they are able to simulate several psychological phenomena. Despite their success, one of their strongest limitations is that they represent word meaning entirely in terms of connections with other words. Cognitive scientists have argued that, in this way, DSMs neglect the fact that humans also rely on non-verbal experiences and have access to rich sources of perceptual knowledge when they learn the meaning of words. In this work, the lack of perceptual grounding of distributional models is addressed by exploiting computer vision techniques that automatically identify discrete "visual words" in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of the images it is associated with. A flexible architecture for integrating text- and image-based distributional information is introduced and tested in a set of empirical evaluations, which show that an integrated model is superior to a purely text-based approach and that it provides somewhat complementary semantic information with respect to the latter.
2013
XXVI
2012-2013
CIMEC (29/10/12-)
Cognitive and Brain Sciences
Baroni, Marco
no
English
Sector L-LIN/01 - Glottology and Linguistics (Glottologia e Linguistica)
Files in this item:
EliaBruniThesis.pdf
Access: open access
Type: Doctoral Thesis
License: All rights reserved
Size: 12.35 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/367901