A Multi-task Learning Framework for Time-continuous Emotion Estimation from Crowd Annotations

We propose multi-task learning (MTL) for time-continuous, or dynamic, emotion (valence and arousal) estimation in movie scenes. Because compiling annotated training data for dynamic emotion prediction is tedious, we collect the annotations through crowdsourcing. Even though the crowdworkers come from various demographics, we demonstrate that MTL can effectively discover (1) consistent patterns in their dynamic emotion perception, and (2) the low-level audio and video features that contribute to their valence and arousal (VA) elicitation. Finally, we show that MTL-based regression models, which simultaneously learn the relationship between low-level audio-visual features and high-level VA ratings from a collection of movie scenes, can predict VA ratings for time-contiguous snippets from each scene more effectively than scene-specific models.

A Multi-task Learning Framework for Time-continuous Emotion Estimation from Crowd Annotations / Khomami Abadi, Mojtaba; Abad, Azad; Subramanian, Ramanathan; Rostamzadeh, Negar; Ricci, E.; Varadarajan, J.; Sebe, Niculae. - (2014), pp. 17-23. (3rd International ACM Workshop on Crowdsourcing for Multimedia, CrowdMM 2014, Orlando, 5 November 2014) [10.1145/2660114.2660126].

Khomami Abadi, Mojtaba; Abad, Azad; Subramanian, Ramanathan; Rostamzadeh, Negar; Ricci, E.; Varadarajan, J.; Sebe, Niculae

2014-01-01

Date of publication: 2014

Proceedings title: Proceedings of the International ACM Workshop on Crowdsourcing for Multimedia

Place of publication: New York

Publisher: ACM

ISBN: 9781450331289

Scopus Identifier: 2-s2.0-84915804125

Publication type: 04.1 Paper in Proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/97417
