The present paper addresses the study of cross-linguistic and cross-modal iconicity within a deep learning framework. An LSTM-based Recurrent Neural Network is trained to associate the phonetic representation of a concrete word, encoded as a sequence of feature vectors, with the visual representation of its referent, expressed as an HCNN-transformed image. The processing network is then tested, without further training, on a language that does not appear in the training set and belongs to a different language family. The performance of the model is evaluated through a comparison with a randomized baseline; we show that such an imaginative network is capable of extracting language-independent generalizations in the mapping from linguistic sounds to visual features, providing empirical support for the hypothesis of a universal sound-symbolic substrate underlying all languages.
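The comparison with a randomized baseline mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which similarity measure or randomization scheme the authors used, so everything below is an assumption made for illustration: the cosine similarity metric, the shuffling of word-to-image pairings, the function names (`cosine`, `mean_similarity`, `permutation_baseline`), and the toy vectors that stand in for network predictions and image representations.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(predicted, targets):
    """Average cosine similarity over aligned (prediction, target) pairs."""
    return sum(cosine(p, t) for p, t in zip(predicted, targets)) / len(predicted)


def permutation_baseline(predicted, targets, n_shuffles=1000, seed=0):
    """Compare the true pairing against randomly shuffled pairings.

    Hypothetical evaluation scheme (not taken from the paper): returns the
    observed score, the mean score over shuffled pairings, and an empirical
    p-value with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = mean_similarity(predicted, targets)
    shuffled = list(targets)
    baseline_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the word-to-referent alignment
        baseline_scores.append(mean_similarity(predicted, shuffled))
    at_least_as_good = sum(score >= observed for score in baseline_scores)
    p_value = (1 + at_least_as_good) / (n_shuffles + 1)
    return observed, sum(baseline_scores) / n_shuffles, p_value


# Toy data: 20 "image vectors" and noisy "predictions" of them.
# These stand in for real model outputs, which are not available here.
data_rng = random.Random(1)
targets = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(20)]
predicted = [[x + data_rng.gauss(0, 0.5) for x in t] for t in targets]

observed, baseline_mean, p_value = permutation_baseline(predicted, targets)
print(f"observed={observed:.3f} baseline={baseline_mean:.3f} p={p_value:.4f}")
```

If the predictions carried no information about the referents, the observed score would be expected to fall within the range of the shuffled scores; a score well above that range is the kind of evidence a randomized baseline is meant to expose.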

Phonovisual Biases in Language: Is the Lexicon Tied to the Visual World? / de Varda, A. G.; Strapparava, C. - In: IJCAI. - ISSN 1045-0823. - (2021), pp. 643-649. (Paper presented at the 30th International Joint Conference on Artificial Intelligence, IJCAI 2021, held in can in 2021).

Phonovisual Biases in Language: Is the Lexicon Tied to the Visual World?

Strapparava C.
2021-01-01

2021
IJCAI International Joint Conference on Artificial Intelligence
USA
International Joint Conferences on Artificial Intelligence
de Varda, A. G.; Strapparava, C.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/341939