Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented.
Smartphone application for the analysis of prosodic features in running speech with a focus on bipolar disorders: System performance evaluation and case study
GUIDI, ANDREA;DE ROSSI, DANILO EMILIO;SCILINGO, ENZO PASQUALE;VANELLO, NICOLA
2015-01-01
Abstract
Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented.File | Dimensione | Formato | |
---|---|---|---|
sensors-15-28070.pdf
accesso aperto
Descrizione: Articolo
Tipologia:
Versione finale editoriale
Licenza:
Creative commons
Dimensione
1.21 MB
Formato
Adobe PDF
|
1.21 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.