A low power Voice Activity Detector for portable applications

Meoni, Gabriele; Pilato, Luca; Fanucci, Luca
2018-01-01

Abstract

Voice Activity Detectors (VADs) are used to improve the performance and reduce the activation rate of speech recognition and keyword spotting applications. The latter is crucial for portable applications because it saves energy and thus extends battery life. Over the last decades, VADs have been implemented in hardware to increase processing speed and reduce power consumption. However, a hardware implementation often restricts the choice of features, which limits recognition performance. This paper presents a low-power, low-area serial logistic regression classifier that uses as features the frame energy, the maximum absolute signal finite difference, and the maximum absolute squared signal finite difference over a frame. The system has been implemented on an IGLOO nano Field Programmable Gate Array (FPGA), with a power consumption of 0.559 mW, and offers acceptable performance for use as a preprocessor for speech recognition systems or for a more sophisticated software VAD.
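
As a rough illustration of the three features and the logistic regression decision named in the abstract, the following Python sketch may help. It is not the authors' hardware design: the function names, the weights, and the bias are hypothetical, and reading the "squared signal finite difference" as the difference of consecutive squared samples is an assumption. The products are accumulated one at a time only to loosely mirror a serial multiply-accumulate datapath.

```python
def frame_features(frame):
    """Compute three per-frame features from a sequence of samples.

    Assumed interpretation (not confirmed by the abstract):
      energy      = sum of squared samples
      max_diff    = max |x[n] - x[n-1]|
      max_sq_diff = max |x[n]^2 - x[n-1]^2|
    The frame must contain at least two samples.
    """
    pairs = list(zip(frame, frame[1:]))
    energy = sum(s * s for s in frame)
    max_diff = max(abs(b - a) for a, b in pairs)
    max_sq_diff = max(abs(b * b - a * a) for a, b in pairs)
    return energy, max_diff, max_sq_diff


def is_speech(frame, weights, bias):
    """Logistic regression decision with hypothetical weights and bias.

    sigmoid(z) >= 0.5 is equivalent to z >= 0, so the sigmoid itself
    does not need to be evaluated for a hard decision.
    """
    z = bias
    # One multiply-accumulate per feature, performed sequentially.
    for w, f in zip(weights, frame_features(frame)):
        z += w * f
    return z >= 0.0
```

For example, `is_speech([0.0, 1.0, -1.0, 2.0], (1.0, 0.0, 0.0), -5.0)` flags the frame as speech, while an all-zero frame with the same illustrative parameters does not. In practice the weights would be obtained by training on labelled frames.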
ISBN: 9781538653869

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/949911