
Investigating Time-Scales in Deep Echo State Networks for Natural Language Processing

Corrado Baccheschi; Alessandro Bondielli; Alessandro Lenci; Alessio Micheli; Lucia Passaro; Marco Podda; Domenico Tortorella
2025-01-01

Abstract

Reservoir Computing (RC) enables efficiently trained deep Recurrent Neural Networks (RNNs) by removing the need to train the hierarchy of representations of the input sequences. In this paper, we analyze the performance and the dynamical behavior of RC models, specifically Deep Bidirectional Echo State Networks (Deep-BiESNs), applied to Natural Language Processing (NLP) tasks. We compare the performance of Deep-BiESNs against fully trained NLP baseline models on six common NLP tasks: three sequence-to-vector tasks for sequence-level classification and three sequence-to-sequence tasks for token-level labeling. Experimental results demonstrate that Deep-BiESNs achieve comparable or superior performance to these baseline models. We then adapt the class activation mapping technique for explainability to analyze the dynamical properties of these deep RC models, highlighting how the hierarchy of representations in Deep-BiESN layers contributes to forming the class prediction in the different NLP tasks. Investigating time scales in deep RNN layers is highly relevant for NLP, because language inherently involves dependencies that occur over various temporal horizons. The findings not only underscore the potential of Deep ESNs as a competitive and efficient alternative for NLP applications, but also contribute to a deeper understanding of how to effectively model such architectures for addressing other NLP challenges.
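
To make the architecture concrete, here is a minimal sketch of a Deep Bidirectional Echo State Network in Python with NumPy. It is an illustration under standard ESN assumptions (leaky-integrator state update, spectral-radius rescaling of the recurrent weights, ridge-regression readout), not the paper's implementation; all function names and hyperparameter values below are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def init_reservoir(n_in, n_res, spectral_radius=0.9, input_scale=0.5):
    # Random, untrained reservoir weights; the recurrent matrix is
    # rescaled to the given spectral radius, a common heuristic for
    # satisfying the echo state property.
    W_in = rng.uniform(-input_scale, input_scale, (n_res, n_in))
    W = rng.uniform(-1.0, 1.0, (n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(U, W_in, W, leak=0.3):
    # Leaky-integrator ESN update over a sequence U of shape (T, n_in);
    # returns the state trajectory of shape (T, n_res).
    X = np.zeros((U.shape[0], W.shape[0]))
    x = np.zeros(W.shape[0])
    for t in range(U.shape[0]):
        x = (1.0 - leak) * x + leak * np.tanh(W_in @ U[t] + W @ x)
        X[t] = x
    return X

def bi_layer(U, W_in, W, leak=0.3):
    # Bidirectional layer: run the reservoir forward and backward in
    # time, then concatenate the two trajectories feature-wise.
    fwd = run_reservoir(U, W_in, W, leak)
    bwd = run_reservoir(U[::-1], W_in, W, leak)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

def deep_bi_esn(U, layers):
    # Stack bidirectional reservoirs: each layer reads the state
    # sequence of the previous one; states from all layers are kept,
    # since each layer develops dynamics at a different time scale.
    H, states = U, []
    for W_in, W in layers:
        H = bi_layer(H, W_in, W)
        states.append(H)
    return np.concatenate(states, axis=1)

# Toy usage: a 3-layer Deep-BiESN over a 12-token "sentence" of
# 50-dimensional embeddings, followed by the only trained component,
# a ridge-regression readout on the collected states.
T, n_in, n_res = 12, 50, 100
U = rng.normal(size=(T, n_in))
dims = [n_in, 2 * n_res, 2 * n_res]      # bidirectionality doubles width
layers = [init_reservoir(d, n_res) for d in dims]
X = deep_bi_esn(U, layers)               # shape (T, 3 * 2 * n_res)
y = rng.normal(size=(T, 1))              # dummy token-level targets
W_out = np.linalg.solve(X.T @ X + 1e-2 * np.eye(X.shape[1]), X.T @ y)

Because the reservoirs stay fixed after initialization, training reduces to the single linear solve on the last line, which is the source of the efficiency claim. In deep reservoir architectures, deeper layers have been observed to develop progressively slower dynamics, and this layer-wise time-scale structure is what the paper's class-activation-mapping analysis probes.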
2025
ISBN: 9783032045522


Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1324255
