CINECA IRIS Institutional Research Information System

We analyze Elman-type recurrent neural networks (RNNs) and their training in the mean-field regime. Specifically, we show convergence of gradient descent training dynamics of the RNN to the corresponding mean-field formulation in the large width limit. We also show that the fixed points of the limiting infinite-width dynamics are globally optimal, under some assumptions on the initialization of the weights. Our results establish optimality for feature-learning with wide RNNs in the mean-field regime.

Global optimality of Elman-type RNNs in the mean-field regime

Andrea Agazzi;Jianfeng Lu;Sayan Mukherjee

2023-01-01

Abstract

We analyze Elman-type recurrent neural networks (RNNs) and their training in the mean-field regime. Specifically, we show convergence of gradient descent training dynamics of the RNN to the corresponding mean-field formulation in the large width limit. We also show that the fixed points of the limiting infinite-width dynamics are globally optimal, under some assumptions on the initialization of the weights. Our results establish optimality for feature-learning with wide RNNs in the mean-field regime.

Scheda breve

Scheda completa

Scheda completa (DC)

Anno

2023

Appare nelle tipologie:

4.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
2303.06726.pdf accesso aperto Tipologia: Documento in Pre-print Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 673.44 kB Formato Adobe PDF Visualizza/Apri	673.44 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1174105

Citazioni

ND

0

0

social impact