One Picture and a Thousand Words: Generative Language+images Models and How to Train Them / Zamparelli, R. - ELECTRONIC. - 3551:(2023). (Paper presented at the 7th Workshop on Natural Language for Artificial Intelligence, NL4AI 2023, held in Rome, 6-7 November 2023).

One Picture and a Thousand Words: Generative Language+images Models and How to Train Them

Zamparelli R.
2023-01-01

Abstract

Thanks to independent advances in language and image generation, we could soon have systems that communicate with humans by combining language and images in their output, a skill that humans do not possess (we receive images at high speed, but we do not produce them). The paper explores some of the implications of this idea: which kinds of data sets need to be developed to train such systems, in which cases language and images could be most usefully integrated, and which issues could arise on the image generation side and on the language+image integration side.
2023
Proceedings of the Seventh Workshop on Natural Language for Artificial Intelligence (NL4AI 2023)
Aachen, Germany
CEUR Workshop Proceedings
Zamparelli, R.
Files in this item:
paper11.pdf

Open access

Description: One Picture and a Thousand Words: Generative Language+images Models and How to Train Them
Type: Published version (Publisher’s layout)
License: Creative Commons
Size: 1.4 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/400445