FedCMD: A Federated Cross-modal Knowledge Distillation for Drivers’ Emotion Recognition

Saira Bano; Nicola Tonellotto; Pietro Cassarà; Alberto Gotta
2024-01-01

Abstract

Emotion recognition has attracted considerable interest in recent years in application areas such as healthcare and autonomous driving. Existing approaches to emotion recognition rely on visual, speech, or psychophysiological signals; however, recent studies investigate multimodal techniques that combine different modalities. In this work, we address the problem of recognizing a driver's emotions from unlabeled videos using multimodal techniques. We propose a collaborative training method based on cross-modal distillation, called "FedCMD" (Federated Cross-Modal Distillation). Federated Learning (FL) is an emerging collaborative, decentralized learning technique that allows each participant to train a model locally and contribute to a better generalized global model without sharing data. The main advantage of FL is that only local data is used for training, thus preserving privacy and providing a secure and efficient emotion recognition system. In FL, the local model of each vehicle device is trained on unlabeled video data by using sensor data as a proxy. Specifically, for each local model, we show how driver emotion annotations can be transferred from the sensor domain to the visual domain through cross-modal distillation. The key idea is based on the observation that a driver's emotional state, as indicated by sensors, correlates with the facial expressions shown in videos. The proposed "FedCMD" approach is tested on the multimodal dataset "BioVid Emo DB" and achieves state-of-the-art performance. Experimental results show that our approach is robust to non-identically distributed data, achieving 96.67% and 90.83% accuracy in classifying five different emotions with IID (independently and identically distributed) and non-IID data, respectively. Moreover, our model is much more robust to overfitting, resulting in better generalization than existing methods.
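To illustrate the two ingredients the abstract describes (training the local video model from sensor-derived soft labels, then aggregating local models federatively), the following is a minimal sketch, not the paper's actual implementation. It assumes PyTorch; the names (SensorTeacher/VideoStudent-style modules, local data loader, temperature) are hypothetical placeholders.

```python
# Hypothetical sketch of federated cross-modal distillation.
# Module/variable names are illustrative, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EMOTIONS = 5      # the paper classifies five emotions
TEMPERATURE = 2.0     # assumed softening temperature for distillation


def local_cross_modal_distillation(teacher: nn.Module,
                                    student: nn.Module,
                                    loader,              # yields (video_clip, sensor_signal) pairs, no labels
                                    epochs: int = 1,
                                    lr: float = 1e-3) -> nn.Module:
    """Train the video (student) model on unlabeled clips, using the
    sensor (teacher) model's soft predictions as the supervision signal."""
    teacher.eval()
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for video_clip, sensor_signal in loader:
            with torch.no_grad():
                # Soft emotion distribution inferred from the physiological signal.
                teacher_probs = F.softmax(teacher(sensor_signal) / TEMPERATURE, dim=-1)
            student_log_probs = F.log_softmax(student(video_clip) / TEMPERATURE, dim=-1)
            # KL divergence pulls the visual predictions toward the sensor-derived ones.
            loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student


def federated_average(global_model: nn.Module, client_models: list) -> nn.Module:
    """FedAvg-style aggregation: average client weights into the global model,
    so raw video and sensor data never leave the vehicle."""
    global_state = global_model.state_dict()
    for key in global_state:
        global_state[key] = torch.stack(
            [m.state_dict()[key].float() for m in client_models], dim=0
        ).mean(dim=0)
    global_model.load_state_dict(global_state)
    return global_model
```

In each round, every vehicle would run the local distillation step on its own unlabeled videos and send only the updated student weights to the server, which averages them into the global model.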

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1264871