
Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging / Khomami Abadi, Mojtaba. - (2017), pp. 1-149.

Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging

Khomami Abadi, Mojtaba
2017-01-01

Abstract

The affective state of a user during an interaction with a computer is a valuable source of information for the computer, which can (i) use it to adapt the interaction, making it seamless and leading to adaptive affective interfaces, or (ii) use the emotional responses of a user to affective multimedia content to tag that content with affective labels. The latter is particularly useful for building affective profiles of users in real-world applications for user-centric multimedia retrieval. Affective responses of users can be collected either explicitly (i.e., users directly assess their own emotions through computer interfaces) or implicitly (i.e., via sensors that capture psycho-physiological signals such as facial expressions, vocal cues, neuro-physiological signals, gestures and body postures). The major contributions of this thesis are as follows: (i) We present (and make publicly available) the first multimodal dataset that includes MEG brain signals, facial videos and peripheral physiological signals of 30 users in response to two sets of affective dynamic stimuli. The dataset was recorded with cutting-edge laboratory equipment in a highly controlled environment, facilitating proper analysis of MEG brain responses for affective neuroscience research. (ii) We then present two further multimodal datasets recorded with off-the-shelf, commercially available sensors for the purpose of analyzing users' affective responses to video clips and computer-generated music excerpts. The stimuli were selected to evoke specific target emotions. The first of these datasets also includes the Big Five personality traits of the participants, and we show that it is possible to infer users' personality traits from their spontaneous reactions to affective videos. Both multimodal datasets were acquired with commercial sensors that are prone to noise artifacts, which results in some noisy unimodal recordings; we make both datasets publicly available together with quality assessments of each signal recording. Using the second dataset, we present a multimodal inference system that jointly considers signal quality and thereby achieves high tolerance to signal noise. We also show that peripheral physiological signals contain patterns that are similar across users, and we develop a cross-user affect recognition system that is successfully validated via a leave-one-subject-out cross-validation scheme on the second dataset. (iii) We also present a crowdsourcing protocol for the collection of time-continuous affect annotations for videos. We collect a dataset of affective annotations for 12 videos with the contribution of over 1500 crowd-workers, and introduce algorithms to extract high-quality time-continuous affect annotations for these videos from the noisy crowd annotations. We observe that, for the prediction of time-continuous affect annotations from low-level multimedia content, higher regression accuracies are achieved when the crowdsourced annotations are used as labels than when expert annotations are used. This suggests that expensive expert annotation in the development of large affective video corpora could be replaced by crowdsourcing techniques. Finally, we discuss opportunities for future applications of our research and conclude with a summary of our contributions to the field of affective computing.
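
The abstract mentions a multimodal inference system that jointly considers the quality of each signal recording. As an illustration only (not the thesis' actual pipeline), the following sketch shows one common way to realize quality-aware late fusion: each modality's prediction is weighted by an estimated quality score for that recording, so that noisy channels contribute less. All names and values are hypothetical.

import numpy as np

def quality_weighted_fusion(scores, qualities):
    """Fuse per-modality class-probability scores using quality weights.

    scores:    array of shape (n_modalities, n_classes), one probability
               vector per modality (e.g. facial video, GSR, ECG).
    qualities: array of shape (n_modalities,), each in [0, 1], where 0
               marks a recording too noisy to trust and 1 a clean one.
    """
    scores = np.asarray(scores, dtype=float)
    w = np.asarray(qualities, dtype=float)
    if w.sum() == 0:                      # every modality unusable
        return np.full(scores.shape[1], 1.0 / scores.shape[1])
    w = w / w.sum()                       # normalize the quality weights
    fused = (w[:, None] * scores).sum(axis=0)
    return fused / fused.sum()            # renormalize to a distribution

# Hypothetical example: three modalities predicting low/high arousal.
modality_scores = [[0.7, 0.3],   # facial video
                   [0.4, 0.6],   # GSR (noisy recording)
                   [0.8, 0.2]]   # ECG
modality_quality = [0.9, 0.2, 0.8]
print(quality_weighted_fusion(modality_scores, modality_quality))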
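The cross-user affect recognition result is reported under a leave-one-subject-out protocol: the classifier is trained on all users but one and tested on the held-out user, rotating over all users. A minimal sketch of that evaluation scheme, assuming a hypothetical feature matrix, binary affect labels, and a subject id per trial:

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 30 users x 20 trials, 64 physiological features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 64))            # per-trial feature vectors
y = rng.integers(0, 2, size=600)          # e.g. low/high arousal labels
subjects = np.repeat(np.arange(30), 20)   # subject id of each trial

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf.fit(X[train_idx], y[train_idx])              # train on 29 subjects
    accuracies.append(clf.score(X[test_idx], y[test_idx]))  # test on the held-out one
print(f"LOSO mean accuracy: {np.mean(accuracies):.3f}")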
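For the crowdsourced time-continuous annotations, one simple way to obtain a single clean trace from many noisy worker traces is to normalize each worker's annotation, take a robust per-frame statistic, and smooth the result. This sketch only illustrates that general idea under assumed data shapes; the thesis' own aggregation algorithms may differ.

import numpy as np
from scipy.ndimage import uniform_filter1d

def aggregate_annotations(traces, smooth_frames=25):
    """Combine noisy time-continuous affect traces from many workers.

    traces: array of shape (n_workers, n_frames), each row one worker's
            valence (or arousal) annotation over time.
    """
    traces = np.asarray(traces, dtype=float)
    # Z-normalize per worker to remove individual offset and scale biases.
    mu = traces.mean(axis=1, keepdims=True)
    sd = traces.std(axis=1, keepdims=True) + 1e-8
    z = (traces - mu) / sd
    # Robust per-frame consensus: the median is insensitive to outlier workers.
    consensus = np.median(z, axis=0)
    # Light temporal smoothing (roughly 1 s at 25 fps) to suppress jitter.
    return uniform_filter1d(consensus, size=smooth_frames)

# Hypothetical use: 50 workers annotating a 2-minute clip at 25 fps.
rng = np.random.default_rng(1)
true_curve = np.sin(np.linspace(0, 4 * np.pi, 3000))
worker_traces = true_curve + rng.normal(scale=0.5, size=(50, 3000))
clean_trace = aggregate_annotations(worker_traces)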
2017
XXVII
2017-2018
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Sebe, Nicu
no
English
Settore INF/01 - Informatica
Files in this item:

Disclaimer_Abadi.pdf
  Access: Restricted to repository managers
  Type: Doctoral thesis
  License: All rights reserved
  Size: 2.35 MB
  Format: Adobe PDF

Moji-Thesis.pdf
  Access: Open access
  Type: Doctoral thesis
  License: All rights reserved
  Size: 8.05 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11572/368244