Hierarchical-Task Reservoir for Anytime POS Tagging from Continuous Speech

Pedrelli, Luca;
2020-01-01

Abstract

We propose a novel architecture called Hierarchical-Task Reservoir (HTR), suitable for real-time sentence parsing from continuous speech. Accordingly, we introduce a novel task that consists of performing anytime Part-of-Speech (POS) tagging from continuous speech. The HTR architecture is designed to address three sub-tasks (phone, word, and POS tag estimation) with increasing levels of abstraction. These tasks are performed by the consecutive layers of the HTR architecture. Interestingly, the qualitative results show that learning the sub-tasks enforces low-frequency dynamics (i.e., dynamics with longer timescales) in the more abstract layers. We compared HTR with a baseline hierarchical reservoir architecture in which each layer is an Echo State Network (ESN) that addresses the same POS tag estimation task. We also performed a thorough experimental comparison with several architectural variants. The HTR obtained the best performance in all experimental comparisons. Overall, the proposed approach will be a useful tool for further studies on both the modeling of language comprehension in a neuroscience context and real-time implementations in a Human-Robot Interaction (HRI) context.
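The layered design described in the abstract can be illustrated with a minimal sketch, assuming leaky-integrator ESN layers with ridge-regression readouts. The abstract does not specify how layers are connected or configured, so every detail below is an assumption made for illustration: the function names, the hyperparameters, the choice to feed each layer's readout output into the next layer, and the random toy data standing in for speech features and labels. Each layer is trained on its own sub-task (phone, word, POS tag), and an output is produced at every time step, which is what makes the tagging "anytime".

```python
import numpy as np

def make_reservoir(n_in, n_units, rng, spectral_radius=0.9, input_scale=0.5):
    """Create random input and recurrent weights (hypothetical settings)."""
    W_in = rng.uniform(-input_scale, input_scale, (n_units, n_in))
    W = rng.uniform(-1.0, 1.0, (n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs, leak=0.3):
    """Run a leaky-integrator reservoir and return one state per time step."""
    x = np.zeros(W.shape[0])
    states = np.empty((len(inputs), W.shape[0]))
    for t, u in enumerate(inputs):
        x = (1.0 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        states[t] = x
    return states

def fit_readout(states, targets, ridge=1e-4):
    """Ridge-regression readout mapping states to targets."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)

def train_htr(inputs, layer_targets, n_units=50, seed=0):
    """Train one reservoir layer per sub-task, in order of abstraction."""
    rng = np.random.default_rng(seed)
    layers, layer_in = [], inputs
    for targets in layer_targets:
        W_in, W = make_reservoir(layer_in.shape[1], n_units, rng)
        states = run_reservoir(W_in, W, layer_in)
        W_out = fit_readout(states, targets)
        layers.append((W_in, W, W_out))
        # Assumption: the readout output of this layer feeds the next layer.
        layer_in = states @ W_out
    return layers

def predict_htr(layers, inputs):
    """Return the per-time-step output of every layer."""
    outputs, layer_in = [], inputs
    for W_in, W, W_out in layers:
        layer_in = run_reservoir(W_in, W, layer_in) @ W_out
        outputs.append(layer_in)
    return outputs

# Toy data only: random features and random one-hot labels, not real speech.
rng = np.random.default_rng(1)
T = 200
features = rng.normal(size=(T, 13))
phones = np.eye(5)[rng.integers(0, 5, T)]
words = np.eye(4)[rng.integers(0, 4, T)]
pos_tags = np.eye(3)[rng.integers(0, 3, T)]

layers = train_htr(features, [phones, words, pos_tags])
outs = predict_htr(layers, features)
pos_estimate = outs[-1].argmax(axis=1)  # one POS estimate per time step
```

In this sketch the baseline mentioned in the abstract would correspond to passing the POS targets to every layer, for example `train_htr(features, [pos_tags, pos_tags, pos_tags])`, again under the same assumptions.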
Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1301468