Speech-Based Detection of In-Car Escalating Arguments to Prevent Distracted Driving
Francesco Pistolesi; Michele Baldassini; Beatrice Lazzerini
2023-01-01
Abstract
Every year, 2.5 million car crashes worldwide involve distracted drivers. A crash can happen within a few seconds of the driver becoming distracted. Distracted driving thus poses a critical threat to road safety and calls for innovative approaches to its detection and mitigation. This paper introduces a novel system that monitors in-car conversations and identifies potential distractions arising from escalating arguments. Combining continuous voice recording with deep learning techniques, the system analyzes Mel spectrograms generated from real-time audio signals of in-car discussions. First, a denoiser based on a convolutional autoencoder reduces car engine noise in the spectrograms. Then, a classifier built from convolutional and recurrent neural networks uses the denoised spectrogram to determine whether the audio corresponds to a calm conversation or a quarrel. In the experiments, the system achieved a classification accuracy of 91.8%. The system addresses a previously unexplored dimension of cognitive distraction and offers valuable insights into strategies for reducing the risk of road accidents. Ongoing research focuses on accounting for other environmental noises that may influence classification accuracy, such as radio speakers, music, wind from open windows, and engine sounds from surrounding vehicles. The system is also being extended to handle more than two occupants in the car.

File | Size | Format |
---|---|---|
2023325950.pdf (open access). Description: Main document. Type: Post-print document. License: All rights reserved | 2.96 MB | Adobe PDF |
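
The abstract names Mel spectrograms as the input to the denoiser and the classifier but gives no signal-processing parameters. As a rough illustration of that first step only, the sketch below computes a log-Mel spectrogram in plain Python. The HTK-style Mel formula, the Hann window, the sample rate, frame length, hop size, number of Mel bands, and every function name are assumptions made for this example, not the authors' settings. The convolutional autoencoder and the convolutional-recurrent classifier are not reproduced here.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """HTK-style Mel scale (an assumed variant; the source does not name one)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters evenly spaced on the Mel scale: n_mels rows, n_fft // 2 + 1 columns."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies in Hz; filter m spans edges m, m + 1, m + 2.
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Power of a naive DFT of one Hann-windowed frame (O(n^2), acceptable for a demo)."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Log-Mel spectrogram in dB: one list of n_mels values per frame."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        mel_energy = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small floor avoids log10(0) in silent bands.
        frames.append([10.0 * math.log10(e + 1e-10) for e in mel_energy])
    return frames


# Synthetic stand-in for recorded audio: a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2.0 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(1024)]
spec = mel_spectrogram(tone, SAMPLE_RATE)
loudest_band = max(range(len(spec[0])), key=lambda m: spec[0][m])
print(len(spec), "frames,", len(spec[0]), "Mel bands, loudest band:", loudest_band)
```

For the synthetic tone, the energy concentrates in the Mel band whose triangular filter covers 1 kHz. In a practical setting, an optimized library routine would replace the naive DFT, and the resulting frames would form the image-like input that a denoiser and classifier of the kind described in the abstract could consume.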
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.