Overlapping speech is a natural, frequently occurring phenomenon in human–human conversation, and it serves an underlying purpose. Speech overlap events may be categorized as competitive or non-competitive. While the former is an attempt to grab the floor, the latter is an attempt to assist the speaker in continuing the turn. The presence and distribution of these categories are indicative of the speakers’ states during the conversation. Therefore, understanding these manifestations is crucial for conversational analysis and for modeling human–machine dialogs. The goal of this study is to design computational models that classify overlapping speech segments of dyadic conversations into competitive vs. non-competitive acts using lexical and acoustic cues, as well as their surrounding context. The designed overlap representations are evaluated in both linear models (Support Vector Machines, SVM) and non-linear models (feed-forward (FFNN), convolutional (CNN) and long short-term memory (LSTM) neural networks). We experiment with lexical and acoustic representations and their combinations from both speaker channels in feature and hidden space. We observe that lexical word-embedding features significantly increase the overall F1-measure compared to both acoustic and bag-of-ngrams lexical representations, suggesting that lexical information can be utilized as a powerful cue for overlap classification. Our comparative study shows that the best computational architecture is an FFNN along with a combination of word embeddings and acoustic features. © 2018 Elsevier Ltd. All rights reserved.
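The abstract distinguishes combining lexical and acoustic representations in feature space from combining them in hidden space. The sketch below illustrates that distinction only; it is not the authors' implementation. The dimensions, the random untrained weights, the single hidden layer, and all names (`dense`, `init_layer`, `feature_fusion`, `hidden_fusion`) are assumptions made for illustration, and the toy vectors stand in for real word-embedding and acoustic features.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one output per weight row."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def feature_fusion(lexical, acoustic, hidden_layer, output_layer):
    """Feature-space combination: concatenate the raw vectors, then one shared hidden layer."""
    hidden = dense(lexical + acoustic, *hidden_layer, relu)
    return dense(hidden, *output_layer, sigmoid)[0]


def hidden_fusion(lexical, acoustic, lex_layer, ac_layer, output_layer):
    """Hidden-space combination: one hidden layer per modality, concatenated before the output."""
    h_lex = dense(lexical, *lex_layer, relu)
    h_ac = dense(acoustic, *ac_layer, relu)
    return dense(h_lex + h_ac, *output_layer, sigmoid)[0]


# Hypothetical sizes; real embeddings and acoustic feature sets are far larger.
LEX_DIM, AC_DIM, HID = 4, 3, 5
rng = random.Random(0)

lexical = [rng.uniform(-1, 1) for _ in range(LEX_DIM)]   # stand-in for a word-embedding vector
acoustic = [rng.uniform(-1, 1) for _ in range(AC_DIM)]   # stand-in for acoustic features

p_feature = feature_fusion(
    lexical, acoustic,
    init_layer(rng, LEX_DIM + AC_DIM, HID),
    init_layer(rng, HID, 1),
)
p_hidden = hidden_fusion(
    lexical, acoustic,
    init_layer(rng, LEX_DIM, HID),
    init_layer(rng, AC_DIM, HID),
    init_layer(rng, 2 * HID, 1),
)
```

Both functions return a single sigmoid score that could be read as the probability of the competitive class. With untrained weights the values carry no meaning; the point is only where the two modalities are joined in each variant.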
Automatic Classification of Speech Overlaps: Feature Representation and Algorithms / Chowdhury, S. A.; Stepanov, E. A.; Danieli, M.; Riccardi, G. - In: COMPUTER SPEECH AND LANGUAGE. - ISSN 0885-2308. - 55 (2019), pp. 145-167.
|Title:||Automatic Classification of Speech Overlaps: Feature Representation and Algorithms|
|Authors:||Chowdhury, S. A.; Stepanov, E. A.; Danieli, M.; Riccardi, G.|
|Journal title:||COMPUTER SPEECH AND LANGUAGE|
|Year of publication:||2019|
|Scopus identifier:||2-s2.0-85059132607|
|ISI identifier:||WOS:000456592100008|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.1016/j.csl.2018.12.001|
|Appears in types:||03.1 Journal article|
Files in this item:
|CSL19-SpeechOverlapCategorization.pdf||Publisher’s layout||All rights reserved||Administrator|