Influence of Hyperparameters on the Convergence of Adam Under the Polyak-Lojasiewicz Inequality

Xia, Lu; Massei, Stefano

doi:10.1007/978-3-031-86169-7_50

Adaptive moment estimation (Adam) is one of the most commonly used optimizers in the training of neural networks. Existing convergence studies focus on demonstrating that the limit point is stationary, i.e., $\lim_{k\to\infty} \nabla f(x^{(k)}) = 0$, or on showing that the ratio between the algorithm's regret and the number of iteration steps goes to zero. In this work, we show that under the Polyak-Łojasiewicz inequality, the sequence of objective function values associated with a run of Adam converges linearly up to a neighbourhood of the optimal value. Moreover, our analysis sheds light on the influence of the various hyperparameters on the convergence of the Adam optimizer. Numerical tests are conducted to assess the convergence speed and accuracy achieved by Adam when varying the configuration of the hyperparameters during the training of a multinomial logistic regression model for image classification.
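As a rough illustration of the setting described in the abstract, the sketch below runs the standard Adam update (with bias correction) on a small quadratic and records the objective values along the run. It is not the authors' experiment, which uses multinomial logistic regression. The objective, the starting point, the function names `f`, `grad_f`, and `adam_values`, and all hyperparameter values are assumptions chosen for illustration. The quadratic is strongly convex, so it satisfies the Polyak-Łojasiewicz inequality $\tfrac{1}{2}\|\nabla f(x)\|^2 \ge \mu\,(f(x) - f^*)$ with $\mu$ equal to its smallest curvature and $f^* = 0$.

```python
import math

# Hypothetical test problem: f(x) = 0.5 * sum(c_i * x_i^2), minimum value 0.
CURVATURES = [1.0, 10.0]


def f(x):
    """Objective value of the toy quadratic."""
    return 0.5 * sum(c * xi * xi for c, xi in zip(CURVATURES, x))


def grad_f(x):
    """Exact gradient of the toy quadratic."""
    return [c * xi for c, xi in zip(CURVATURES, x)]


def adam_values(x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Run standard Adam and return the objective value at every iterate."""
    x = list(x0)
    m = [0.0] * len(x)  # first-moment (mean) estimate
    v = [0.0] * len(x)  # second-moment (uncentered variance) estimate
    values = [f(x)]
    for k in range(1, steps + 1):
        g = grad_f(x)
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1.0 - beta1) * gi
            v[i] = beta2 * v[i] + (1.0 - beta2) * gi * gi
            m_hat = m[i] / (1.0 - beta1 ** k)  # bias-corrected first moment
            v_hat = v[i] / (1.0 - beta2 ** k)  # bias-corrected second moment
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        values.append(f(x))
    return values


values = adam_values([1.0, -2.0])
print("initial objective:", values[0])
print("final objective:  ", values[-1])
```

Rerunning `adam_values` with different `lr`, `beta1`, `beta2`, or `eps` gives an informal feel for how the hyperparameters change both the speed of decrease and the level at which the objective values settle on this toy problem.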

2025-01-01


Year

2025

ISBN

9783031861680
9783031861697


Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1329377
