Combining Electrodermal Activity and Speech Analysis towards a more Accurate Emotion Recognition System
Greco A.; Lanata A.; Scilingo E. P.; Vanello N.
2019-01-01
Abstract
Current research in the field of emotion recognition is exploring the possibility of merging information from physiological signals, behavioural data, and speech. Electrodermal activity (EDA) is among the main psychophysiological indicators of arousal. Nonetheless, it is quite difficult to analyze in ecological scenarios, for instance when the subject is speaking. On the other hand, speech carries relevant information about the subject's emotional state, and its potential in the field of affective computing has yet to be fully exploited. In this work, we explore the possibility of merging information from EDA and speech to improve the recognition of human arousal level during the pronunciation of single affective words. Unlike the majority of studies in the literature, we focus on the speaker's arousal rather than on the emotion conveyed by the spoken word. Specifically, a support vector machine with a recursive feature elimination strategy (SVM-RFE) is trained and tested on three datasets, i.e., using the two channels (speech and EDA) separately and then jointly. The results show that merging EDA and speech information significantly improves on the marginal classifier (+11.64%). The six features selected by the RFE procedure will be used for the development of a future multivariate model of emotions.
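The SVM-RFE scheme described above can be sketched as follows. This is an illustrative example only, not the authors' implementation: the actual EDA and speech descriptors, dataset sizes, and SVM hyperparameters are not given in the abstract, so the synthetic data, feature counts, and parameter values below are assumptions. A linear-kernel SVM exposes per-feature weights, which RFE uses to iteratively discard the least informative features until a target number (here six, matching the abstract) remains.

```python
# Hedged sketch of SVM-RFE over a fused EDA + speech feature vector.
# All feature counts and data below are synthetic placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples, n_features = 120, 20          # e.g., 10 EDA + 10 speech descriptors (hypothetical)
X = rng.standard_normal((n_samples, n_features))
y = (X[:, 0] + X[:, 5] > 0).astype(int)  # synthetic binary arousal labels (low/high)

# Linear kernel is required: RFE ranks features via the SVM's coef_ weights.
svm = SVC(kernel="linear", C=1.0)
rfe = RFE(estimator=svm, n_features_to_select=6, step=1)
rfe.fit(StandardScaler().fit_transform(X), y)

selected = np.flatnonzero(rfe.support_)   # indices of the six surviving features
print("selected feature indices:", selected)
```

In a real pipeline, feature standardization and RFE would be fitted inside a cross-validation loop to avoid information leakage between training and test folds.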