
How much competence is there in performance? Assessing the distributional hypothesis in word bigrams / Seltmann, J.; Ducceschi, L.; Herbelot, A. - 2481:(2019). (Paper presented at the 6th Italian Conference on Computational Linguistics, CLiC-it 2019, held in Italy in 2019.)

How much competence is there in performance? Assessing the distributional hypothesis in word bigrams

Seltmann, J.; Ducceschi, L.; Herbelot, A.
2019-01-01

Abstract

The field of Distributional Semantics (DS) is built on the ‘distributional hypothesis’, which states that meaning can be recovered from statistical information in observable language. It is notable, however, that the computations needed to obtain ‘good’ DS representations are often very involved, implying that if meaning is derivable from linguistic data, it is not directly encoded in it. This prompts fundamental questions about language acquisition: if we regard text data as linguistic performance, what kind of ‘innate’ mechanisms must operate over that data to reach competence? In other words, how much of semantic acquisition is truly data-driven, and what must be hard-coded in a system’s architecture? In this paper, we introduce a new methodology to pull those questions apart. We use state-of-the-art computational models to investigate the amount and nature of the transformations required to perform particular semantic tasks. We apply that methodology to one of the simplest structures in language, the word bigram, giving insights into the specific contribution of that linguistic component.
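The distributional hypothesis at the bigram level can be made concrete with a minimal sketch: represent each word by its distribution over immediate right neighbours and compare words by cosine similarity. This is only an illustration of the underlying hypothesis on a hypothetical toy corpus, not the paper’s actual models or data.

```python
from collections import Counter
from math import sqrt

# Toy corpus; illustrative only, not the paper's training data.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count word bigrams: each word's vector is its raw count
# distribution over immediate right neighbours, one simple
# reading of the distributional hypothesis at the bigram level.
bigrams = Counter(zip(corpus, corpus[1:]))
vocab = sorted(set(corpus))

def vector(word):
    """Co-occurrence vector of `word` over right-neighbour counts."""
    return [bigrams[(word, w)] for w in vocab]

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# 'cat' and 'dog' share the right neighbour 'sat', so their raw
# bigram vectors already point in the same direction, with no
# further transformation of the data.
print(round(cosine(vector("cat"), vector("dog")), 2))  # → 1.0
```

The paper’s question is precisely how far such raw statistics carry on real tasks, and how much additional machinery (the ‘innate’ component) a model must supply on top of them.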
2019
CEUR Workshop Proceedings
Aachen
CEUR-WS
Seltmann, J.; Ducceschi, L.; Herbelot, A.
Files for this product:

File: 2019_how_much_competence_in_performance.pdf
Access: open access
Type: Published version (Publisher’s layout)
License: Creative Commons
Size: 376.29 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/249655
Citations
  • PMC: n/a
  • Scopus: 0
  • Web of Science: n/a
  • OpenAlex: n/a