Semantic Segmentation of Text Using Deep Learning / Lattisi, Tiziano; Farina, Davide; Ronchetti, Marco. - In: COMPUTING AND INFORMATICS. - ISSN 1335-9150. - ELECTRONIC. - 2022, 41:1(2022), pp. 78-97. [10.31577/cai_2022_1_78]
Semantic Segmentation of Text Using Deep Learning
Lattisi, Tiziano; Ronchetti, Marco
2022-01-01
Abstract
Given a text, can we segment it into semantically coherent sections in an automatic way? Can we detect the semantic boundaries, if we know how many there are? Can we determine how many semantically distinct sections are in the text? These are the questions we address in this paper. To answer them, we use Bidirectional Encoder Representations from Transformers (BERT) to analyze the text and evaluate a function that we call local incoherence, which we expect to show maxima at the points where a semantic boundary is detected. Our results, although preliminary, are encouraging and suggest that our approach can be successfully applied. However, they are quite sensitive to the quality of the text, as happens when the text is derived from an audio stream via Automatic Speech Recognition techniques.

File | Description | Type | License | Size | Format |
---|---|---|---|---|---|
vieraj,+5877_adsi-8.pdf (open access) | Article | Publisher's layout | All rights reserved | 891.4 kB | Adobe PDF |