Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging / Khomami Abadi, Mojtaba. - (2017), pp. 1-149.
Analysis of users' psycho-physiological parameters in response to affective multimedia - A multimodal and implicit approach for user-centric multimedia tagging
Khomami Abadi, Mojtaba
2017-01-01
Abstract
The affective state of a user during an interaction with a computer is a valuable source of information that the computer can use to (i) adapt the interaction and make it run smoothly, leading to adaptive affective interfaces, or (ii) tag affective multimedia content with affective labels derived from the user's emotional responses to it. The latter is particularly useful for building affective profiles of users in real-world applications for user-centric multimedia retrieval. Affective responses can be collected either explicitly (i.e., users directly assess their own emotions through computer interfaces) or implicitly (i.e., via sensors that capture psycho-physiological signals such as facial expressions, vocal cues, neuro-physiological signals, gestures and body postures). The major contributions of this thesis are as follows: (i) We present, and have made publicly available, the first multimodal dataset that includes MEG brain signals, facial videos and peripheral physiological signals of 30 users in response to two sets of affective dynamic stimuli. The dataset was recorded with cutting-edge laboratory equipment in a highly controlled environment, enabling proper analysis of MEG brain responses for affective neuroscience research. (ii) We then present two further multimodal datasets, recorded with off-the-shelf, commercially available sensors, for analyzing users' affective responses to video clips and computer-generated music excerpts; the stimuli were selected to evoke specific target emotions. The first of these datasets also includes the Big Five personality traits of the participants, and we show that users' personality traits can be inferred from their spontaneous reactions to affective videos. Both datasets were acquired with commercial sensors that are prone to noise artifacts, which results in some noisy unimodal recordings; we release both datasets together with quality assessments of every signal recording. On the second dataset we present a multimodal inference system that jointly considers signal quality and thereby achieves high tolerance to signal noise. We also show that peripheral physiological signals contain patterns that are similar across users, and we develop a cross-user affect recognition system that is successfully validated with a leave-one-subject-out cross-validation scheme on the same dataset.
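To make the evaluation protocol concrete, the following is a minimal sketch of leave-one-subject-out cross-validation for a cross-user affect classifier. The feature matrix, binary arousal labels and SVM classifier are synthetic placeholders chosen for illustration; this is not the pipeline used in the thesis.

```python
# Minimal sketch (not the thesis pipeline): leave-one-subject-out validation
# of a cross-user affect classifier on synthetic placeholder data.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_subjects, trials_per_subject, n_features = 30, 20, 64

# X: physiological features per trial; y: high/low arousal labels (synthetic here)
X = rng.normal(size=(n_subjects * trials_per_subject, n_features))
y = rng.integers(0, 2, size=n_subjects * trials_per_subject)
subjects = np.repeat(np.arange(n_subjects), trials_per_subject)

logo = LeaveOneGroupOut()
accuracies = []
for train_idx, test_idx in logo.split(X, y, groups=subjects):
    # Train on all subjects except one, test on the held-out subject
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X[train_idx], y[train_idx])
    accuracies.append(clf.score(X[test_idx], y[test_idx]))

print(f"Mean LOSO accuracy over {n_subjects} subjects: {np.mean(accuracies):.3f}")
```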
(iii) We also present a crowdsourcing protocol for the collection of time-continuous affect annotations for videos. We collect a dataset of affective annotations for 12 videos with contributions from over 1500 crowd-workers, and we introduce algorithms to extract high-quality time-continuous affect annotations for these videos from the noisy crowd annotations. We observe that, when predicting time-continuous affect annotations from low-level multimedia content, higher regression accuracies are achieved with the crowdsourced annotations as labels than with expert annotations. This suggests that expensive expert annotation in the development of large affective video corpora could be replaced by crowdsourcing techniques. Finally, we discuss opportunities for future applications of our research and conclude with a summary of our contributions to the field of affective computing.
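As an illustration of the general idea behind fusing noisy crowd annotations into one time-continuous signal (not the specific algorithms introduced in the thesis), the sketch below normalizes each worker's trace and averages the traces weighted by their agreement with the frame-wise median; all data is synthetic.

```python
# Minimal sketch (not the thesis algorithm): fuse noisy per-worker valence traces
# for one video into a single consensus annotation, weighting workers by agreement.
import numpy as np

rng = np.random.default_rng(1)
n_workers, n_frames = 50, 300

# Synthetic ground-truth valence curve plus per-worker bias, scale and noise
truth = np.sin(np.linspace(0, 3 * np.pi, n_frames))
traces = np.array([
    rng.normal(0.5, 1.0) * truth + rng.normal(0, 0.3) + rng.normal(0, 0.4, n_frames)
    for _ in range(n_workers)
])

# 1. Normalize each worker's trace to remove individual bias and scale
z = (traces - traces.mean(axis=1, keepdims=True)) / (traces.std(axis=1, keepdims=True) + 1e-8)

# 2. Weight each worker by correlation with the frame-wise median trace
median_trace = np.median(z, axis=0)
weights = np.array([np.corrcoef(t, median_trace)[0, 1] for t in z])
weights = np.clip(weights, 0, None)   # discard anti-correlated workers
weights /= weights.sum()

# 3. Weighted average gives the consensus time-continuous annotation
consensus = weights @ z
print("Correlation of consensus with truth:", np.corrcoef(consensus, truth)[0, 1].round(3))
```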
| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Disclaimer_Abadi.pdf | Archive administrators only | Doctoral Thesis | All rights reserved | 2.35 MB | Adobe PDF |
| Moji-Thesis.pdf | Open access | Doctoral Thesis | All rights reserved | 8.05 MB | Adobe PDF |