Sentence-level embeddings reveal the role of the semantic system in integrating the meaning of single words into sentences / Belluzzi, Andrea; Fairhall, Scott. - (2025). (SNL, Gallaudet University, Washington, DC, 12th-14th Sep 2025).
Sentence-level embeddings reveal the role of the semantic system in integrating the meaning of single words into sentences
Belluzzi, Andrea (first author); Fairhall, Scott (last author)
2025-01-01
Abstract
Neural representations of single concepts have been extensively studied, identifying a left-lateralised network encompassing the ATL, AG, medial PFC, precuneus, IFG, VTC, and pMTG: the semantic system. However, less is known about how the semantic system combines single concepts into increasingly complex representations, such as sentences. To investigate this combinatorial process, we leveraged sentence embeddings derived from large language models (LLMs), which capture the overall meaning of sentences, to examine how this process is distributed across the semantic system. We reanalysed an fMRI study (N=24) in which participants read 240 Italian sentences formed by a subject, a verb, and a complement (e.g. “The cops arrest the thieves”). To isolate sentence-level meaning from that of the individual words, sentence embeddings (SONAR, by Meta) were used to build (a) a model of the sentences in their original order and (b) a model derived from the same sentences with scrambled word order. Using representational similarity analysis (RSA), we contrasted the ordered and scrambled models (a > b), revealing that sentence-level meaning is represented uniformly across the whole semantic system. While this manipulation produced robust regional differences in the magnitude of representational strength, no specific region showed greater relevance to combinatorial meaning when the proportional change in captured information was assessed. To further investigate whether different regions within the semantic system support distinct aspects of the combinatorial process, we focused on the roles played by nouns. Specifically, we constructed two “impoverished” models in which sentence meaning is degraded by removing one sentence component: (c) the first noun (“The arrest the thieves”) or (d) the final noun (“The cops arrest the”). We found that removal of the first noun (c > d) more strongly affected sentence-level meaning in a core area of the semantic system, the precuneus.
In contrast, removal of the final noun (d > c) more strongly impacted sentence meaning in a region that processes contextual associations: the right PPA. The comparison between original and scrambled sentences suggests that the semantic system is uniformly engaged in the integration of single words into sentence meaning. When focusing on individual sentence components, however, dissociations emerged between brain regions: the precuneus, a core region of the semantic system, provides the foundation of sentence meaning, while the right PPA, known for processing contextual associations, integrates new information with the rest of the sentence. This study demonstrates an innovative way to use LLMs, and sentence embeddings in particular, by altering aspects of the original sentences (such as word order or specific components) to investigate distinct linguistic and semantic processes.
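As an illustration of the analysis logic only (a sketch, not the authors' pipeline), the model-versus-brain comparison described above can be expressed as: build a representational dissimilarity matrix (RDM) from sentence embeddings, build one from neural patterns, and compare them with a rank correlation; the ordered-minus-scrambled difference is the (a > b) contrast. Embeddings, neural patterns, and the embedding dimensionality here are random placeholders standing in for SONAR vectors and ROI voxel data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scrambled-word-order stimuli are built by shuffling each sentence's words
# before feeding it to the sentence encoder (placeholder example).
words = "The cops arrest the thieves".split()
rng.shuffle(words)
scrambled_sentence = " ".join(words)

# Placeholder matrices: 240 sentences, assumed embedding dim, assumed ROI size.
n_sentences = 240
emb_ordered = rng.standard_normal((n_sentences, 1024))    # stand-in for SONAR embeddings
emb_scrambled = rng.standard_normal((n_sentences, 1024))  # stand-in for scrambled-order embeddings
neural = rng.standard_normal((n_sentences, 500))          # stand-in for ROI voxel patterns

def rdm(patterns):
    """Condensed RDM: 1 - Pearson r between all pairs of row patterns."""
    c = np.corrcoef(patterns)
    iu = np.triu_indices_from(c, k=1)
    return 1.0 - c[iu]

def spearman(a, b):
    """Spearman rank correlation (no tie correction; fine for continuous data)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# Model fit per ROI = rank correlation between model RDM and neural RDM;
# the ordered > scrambled contrast isolates word-order-dependent meaning.
fit_ordered = spearman(rdm(emb_ordered), rdm(neural))
fit_scrambled = spearman(rdm(emb_scrambled), rdm(neural))
contrast = fit_ordered - fit_scrambled
```

The same machinery applies to the "impoverished" models (c) and (d): re-embed each sentence with its first or final noun deleted and contrast the resulting RDM fits against each other.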



