
Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging / Khomami Abadi, Mojtaba. - (2017), pp. 1-149.

Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging

Khomami Abadi, Mojtaba
2017-01-01

Abstract

The affective state of a user during an interaction with a computer is a valuable source of information for the computer, which can (i) use it to adapt the interaction, making it seamless and leading to adaptive affective interfaces, or (ii) use the emotional responses of a user to affective multimedia content to tag that content with affective labels. The latter is particularly useful for building affective profiles of users in real-world applications for user-centric multimedia retrieval. Affective responses of users can be collected either explicitly (i.e., users directly assess their own emotions through computer interfaces) or implicitly (i.e., via sensors that capture psycho-physiological signals such as facial expressions, vocal cues, neuro-physiological signals, gestures and body postures). The major contributions of this thesis are as follows: (i) We present (and make publicly available) the first multimodal dataset that includes MEG brain signals, facial videos and peripheral physiological signals of 30 users in response to two sets of affective dynamic stimuli. The dataset was recorded with cutting-edge laboratory equipment in a highly controlled environment, facilitating proper analysis of MEG brain responses for affective neuroscience research. (ii) We then present two further multimodal datasets recorded with off-the-shelf, commercially available sensors for the purpose of analyzing users' affective responses to video clips and computer-generated music excerpts. The stimuli were selected to evoke specific target emotions. The first of these datasets also includes the Big Five personality traits of the participants, and we show that it is possible to infer users' personality traits from their spontaneous reactions to affective videos. Both multimodal datasets were acquired with commercial sensors that are prone to noise artifacts, which results in some noisy unimodal recordings; we make both datasets publicly available together with quality assessments of each signal recording. Using the second dataset, we present a multimodal inference system that jointly considers signal quality and thereby achieves high tolerance to signal noise. We also show that peripheral physiological signals contain patterns that are similar across users, and we develop a cross-user affect recognition system that is successfully validated via a leave-one-subject-out cross-validation scheme on the second dataset. (iii) We also present a crowdsourcing protocol for the collection of time-continuous affect annotations for videos. We collect a dataset of affective annotations for 12 videos with the contribution of over 1500 crowd-workers, and introduce algorithms to extract high-quality time-continuous affect annotations for these videos from the noisy crowd annotations. We observe that, for the prediction of time-continuous affect annotations from low-level multimedia content, higher regression accuracies are achieved when the crowdsourced annotations are used as labels than when expert annotations are used. This suggests that expensive expert annotation in the development of large affective video corpora could be replaced by crowdsourcing techniques. Finally, we discuss opportunities for future applications of our research and conclude with a summary of our contributions to the field of affective computing.
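
The abstract mentions a multimodal inference system that jointly considers the quality of each signal recording. As an illustration only (not the thesis' actual pipeline), the following sketch shows one common way to realize quality-aware late fusion: each modality's prediction is weighted by an estimated quality score for that recording, so that noisy channels contribute less. All names and values are hypothetical.

import numpy as np

def quality_weighted_fusion(scores, qualities):
    """Fuse per-modality class-probability scores using quality weights.

    scores:    array of shape (n_modalities, n_classes), one probability
               vector per modality (e.g. facial video, GSR, ECG).
    qualities: array of shape (n_modalities,), each in [0, 1], where 0
               marks a recording too noisy to trust and 1 a clean one.
    """
    scores = np.asarray(scores, dtype=float)
    w = np.asarray(qualities, dtype=float)
    if w.sum() == 0:                      # every modality unusable
        return np.full(scores.shape[1], 1.0 / scores.shape[1])
    w = w / w.sum()                       # normalize the quality weights
    fused = (w[:, None] * scores).sum(axis=0)
    return fused / fused.sum()            # renormalize to a distribution

# Hypothetical example: three modalities predicting low/high arousal.
modality_scores = [[0.7, 0.3],   # facial video
                   [0.4, 0.6],   # GSR (noisy recording)
                   [0.8, 0.2]]   # ECG
modality_quality = [0.9, 0.2, 0.8]
print(quality_weighted_fusion(modality_scores, modality_quality))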
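The cross-user affect recognition result is reported under a leave-one-subject-out protocol: the classifier is trained on all users but one and tested on the held-out user, rotating over all users. A minimal sketch of that evaluation scheme, assuming a hypothetical feature matrix, binary affect labels, and a subject id per trial:

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data: 30 users x 20 trials, 64 physiological features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 64))            # per-trial feature vectors
y = rng.integers(0, 2, size=600)          # e.g. low/high arousal labels
subjects = np.repeat(np.arange(30), 20)   # subject id of each trial

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf.fit(X[train_idx], y[train_idx])              # train on 29 subjects
    accuracies.append(clf.score(X[test_idx], y[test_idx]))  # test on the held-out one
print(f"LOSO mean accuracy: {np.mean(accuracies):.3f}")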
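For the crowdsourced time-continuous annotations, one simple way to obtain a single clean trace from many noisy worker traces is to normalize each worker's annotation, take a robust per-frame statistic, and smooth the result. This sketch only illustrates that general idea under assumed data shapes; the thesis' own aggregation algorithms may differ.

import numpy as np
from scipy.ndimage import uniform_filter1d

def aggregate_annotations(traces, smooth_frames=25):
    """Combine noisy time-continuous affect traces from many workers.

    traces: array of shape (n_workers, n_frames), each row one worker's
            valence (or arousal) annotation over time.
    """
    traces = np.asarray(traces, dtype=float)
    # Z-normalize per worker to remove individual offset and scale biases.
    mu = traces.mean(axis=1, keepdims=True)
    sd = traces.std(axis=1, keepdims=True) + 1e-8
    z = (traces - mu) / sd
    # Robust per-frame consensus: the median is insensitive to outlier workers.
    consensus = np.median(z, axis=0)
    # Light temporal smoothing (roughly 1 s at 25 fps) to suppress jitter.
    return uniform_filter1d(consensus, size=smooth_frames)

# Hypothetical use: 50 workers annotating a 2-minute clip at 25 fps.
rng = np.random.default_rng(1)
true_curve = np.sin(np.linspace(0, 4 * np.pi, 3000))
worker_traces = true_curve + rng.normal(scale=0.5, size=(50, 3000))
clean_trace = aggregate_annotations(worker_traces)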
2017
XXVII
2017-2018
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Sebe, Nicu
no
English
Settore INF/01 - Informatica
Files in this item:

Disclaimer_Abadi.pdf
  Access: Restricted to repository managers
  Type: Doctoral thesis
  License: All rights reserved
  Size: 2.35 MB
  Format: Adobe PDF

Moji-Thesis.pdf
  Access: Open access
  Type: Doctoral thesis
  License: All rights reserved
  Size: 8.05 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11572/368244