
Speech-Based Detection of In-Car Escalating Arguments to Prevent Distracted Driving

Francesco Pistolesi (first author); Michele Baldassini (second author); Beatrice Lazzerini (last author)
2023-01-01

Abstract

Every year, 2.5 million car crashes worldwide involve distracted drivers. A crash can occur within a few seconds of the driver becoming distracted. Distracted driving therefore poses a critical threat to road safety and calls for innovative approaches to detect and mitigate it. This paper introduces a novel system that monitors in-car conversations and identifies potential distractions caused by escalating arguments. The system combines continuous voice recording with deep learning to analyze Mel spectrograms generated from real-time audio of in-car discussions. First, a denoiser based on a convolutional autoencoder reduces car engine noise in the spectrograms. Then, a classifier built on convolutional and recurrent neural networks uses the denoised spectrogram to determine whether the audio corresponds to a calm conversation or a quarrel. In the experiments, the system achieved a classification accuracy of 91.8%. The system addresses a previously unexplored dimension of cognitive distraction and offers insights into strategies for reducing the risk of road accidents. Ongoing research focuses on accounting for other environmental noises that may influence classification accuracy, such as radio speakers, music, wind from open windows, and engine sounds from surrounding vehicles. The system is also being extended to handle more than two occupants in the car.
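
The abstract names Mel spectrograms as the input to both the denoiser and the classifier. As a rough illustration of that first step only, the following NumPy sketch computes a log-Mel spectrogram from a mono audio signal. The sampling rate, FFT size, hop length, and number of Mel bands are assumptions chosen for the example, and the function names are hypothetical; none of them come from the paper. The convolutional autoencoder and the convolutional-recurrent classifier are not sketched here because the abstract does not specify their architectures.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the Mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        # Keep the triangle, zero everything outside [lo, hi].
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Log-Mel spectrogram in dB, shape (n_mels, n_frames).

    All default parameter values are illustrative assumptions.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft//2 + 1)
    mel_power = mel_filterbank(sr, n_fft, n_mels) @ power.T
    return 10.0 * np.log10(mel_power + 1e-10)  # small offset avoids log(0)

# Usage: one second of a synthetic 300 Hz tone stands in for recorded speech.
t = np.arange(16000) / 16000.0
spec = mel_spectrogram(np.sin(2 * np.pi * 300 * t))
```

In a pipeline like the one described, an array of this kind would be passed to the denoiser and then to the classifier; a dedicated audio library would normally replace the hand-written filterbank.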
Year: 2023
ISBN: 979-8-3503-2445-7
Files in this item:
File: 2023325950.pdf
Access: open access
Description: Main document
Type: Post-print document
License: All rights reserved
Size: 2.96 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1212690