Enabling Smart Home Voice Control for Italian People with Dysarthria: Preliminary Analysis of Frame Rate Effect on Speech Recognition

Marini M.;Meoni G.;Mulfari D.;Vanello N.;Fanucci L.
2021-01-01

Abstract

Within the field of automatic speech recognition, processing dysarthric speech is a challenge because standard approaches are ineffective in the presence of dysarthria. This paper presents preliminary evidence that the performance of speaker-dependent speech recognition systems trained for speakers with dysarthria may be substantially improved by tuning the size and shift of the spectral analysis window used to compute the initial short-time Fourier transform found in many speech front ends. The evidence comes from a set of experiments performed on a small collection of Italian speech (isolated words) from five speakers with different degrees of dysarthria. The experimental framework builds speaker-dependent GMM-HMM speech recognition models using the triphone Kaldi recipe while varying the spectral analysis window size and shift. Results show an improvement that varies by speaker, ranging from 31% to 81%.
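To make the role of the two tuned parameters concrete, the following is a minimal sketch of short-time Fourier analysis in which the window size and shift are explicit arguments. It is not the paper's front end: the function name `stft_frames`, the Hamming window, the 16 kHz sample rate, and the three (size, shift) pairs are assumptions chosen for illustration, and the abstract does not state which values were actually tested.

```python
import numpy as np

def stft_frames(signal, sample_rate, win_ms=25.0, shift_ms=10.0):
    """Return magnitude spectra, one row per analysis frame.

    win_ms and shift_ms are the analysis window size and shift in
    milliseconds. The defaults are common front-end values, assumed
    here for illustration only.
    """
    win = int(round(sample_rate * win_ms / 1000.0))      # samples per window
    shift = int(round(sample_rate * shift_ms / 1000.0))  # samples between frame starts
    if len(signal) < win:
        return np.empty((0, win // 2 + 1))
    # Only frames that fit entirely inside the signal are kept.
    n_frames = 1 + (len(signal) - win) // shift
    # Index matrix: row i covers samples [i*shift, i*shift + win).
    idx = shift * np.arange(n_frames)[:, None] + np.arange(win)[None, :]
    frames = signal[idx] * np.hamming(win)
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of synthetic noise at an assumed 16 kHz sample rate.
sr = 16000
one_second = np.random.default_rng(0).standard_normal(sr)

# Hypothetical (size, shift) settings, not the values used in the paper.
for win_ms, shift_ms in [(25, 10), (50, 20), (100, 40)]:
    spec = stft_frames(one_second, sr, win_ms, shift_ms)
    print(f"window={win_ms} ms shift={shift_ms} ms -> "
          f"{spec.shape[0]} frames x {spec.shape[1]} frequency bins")
```

A larger shift lowers the frame rate, so the recognizer sees fewer observations per second of speech, while a longer window gives finer frequency resolution at the cost of time resolution. This is the trade-off that the experiments described above explore per speaker.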

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1116684