FedCMD: A Federated Cross-modal Knowledge Distillation for Drivers’ Emotion Recognition

Saira Bano; Nicola Tonellotto; Pietro Cassarà; Alberto Gotta
2024-01-01

Abstract

Emotion recognition has attracted considerable interest in recent years in application areas such as healthcare and autonomous driving. Existing approaches to emotion recognition rely on visual, speech, or psychophysiological signals; however, recent studies investigate multimodal techniques that combine different modalities. In this work, we address the problem of recognizing a driver's emotions from unlabeled videos using multimodal techniques. We propose a collaborative training method based on cross-modal distillation, called "FedCMD" (Federated Cross-Modal Distillation). Federated Learning (FL) is an emerging collaborative, decentralized learning technique that allows each participant to train a model locally and contribute to a better generalized global model without sharing data. The main advantage of FL is that only local data is used for training, thus preserving privacy and providing a secure and efficient emotion recognition system. In FL, the local model of each vehicle device is trained on unlabeled video data by using sensor data as a proxy. Specifically, for each local model, we show how driver emotion annotations can be transferred from the sensor domain to the visual domain through cross-modal distillation. The key idea is based on the observation that a driver's emotional state, as indicated by sensors, correlates with the facial expressions shown in videos. The proposed "FedCMD" approach is tested on the multimodal dataset "BioVid Emo DB" and achieves state-of-the-art performance. Experimental results show that our approach is robust to non-identically distributed data, achieving 96.67% and 90.83% accuracy in classifying five different emotions with IID (independently and identically distributed) and non-IID data, respectively. Moreover, our model is much more robust to overfitting, resulting in better generalization than existing methods.
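To illustrate the two ingredients the abstract describes (training the local video model from sensor-derived soft labels, then aggregating local models federatively), the following is a minimal sketch, not the paper's actual implementation. It assumes PyTorch; the names (SensorTeacher/VideoStudent-style modules, local data loader, temperature) are hypothetical placeholders.

```python
# Hypothetical sketch of federated cross-modal distillation.
# Module/variable names are illustrative, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_EMOTIONS = 5      # the paper classifies five emotions
TEMPERATURE = 2.0     # assumed softening temperature for distillation


def local_cross_modal_distillation(teacher: nn.Module,
                                    student: nn.Module,
                                    loader,              # yields (video_clip, sensor_signal) pairs, no labels
                                    epochs: int = 1,
                                    lr: float = 1e-3) -> nn.Module:
    """Train the video (student) model on unlabeled clips, using the
    sensor (teacher) model's soft predictions as the supervision signal."""
    teacher.eval()
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for video_clip, sensor_signal in loader:
            with torch.no_grad():
                # Soft emotion distribution inferred from the physiological signal.
                teacher_probs = F.softmax(teacher(sensor_signal) / TEMPERATURE, dim=-1)
            student_log_probs = F.log_softmax(student(video_clip) / TEMPERATURE, dim=-1)
            # KL divergence pulls the visual predictions toward the sensor-derived ones.
            loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student


def federated_average(global_model: nn.Module, client_models: list) -> nn.Module:
    """FedAvg-style aggregation: average client weights into the global model,
    so raw video and sensor data never leave the vehicle."""
    global_state = global_model.state_dict()
    for key in global_state:
        global_state[key] = torch.stack(
            [m.state_dict()[key].float() for m in client_models], dim=0
        ).mean(dim=0)
    global_model.load_state_dict(global_state)
    return global_model
```

In each round, every vehicle would run the local distillation step on its own unlabeled videos and send only the updated student weights to the server, which averages them into the global model.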

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1264871