Influence of Hyperparameters on the Convergence of Adam Under the Polyak-Lojasiewicz Inequality

Massei, Stefano
2025-01-01

Abstract

Adaptive moment estimation (Adam) is one of the most commonly used optimizers in the training of neural networks. Existing convergence studies focus on demonstrating that the limit point is stationary, i.e., $\lim_{k\to\infty} \nabla f(x^{(k)}) = 0$, or on showing that the ratio between the algorithm's regret and the number of iteration steps goes to zero. In this work, we show that under the Polyak-Łojasiewicz inequality, the sequence of objective function values associated with a run of Adam converges linearly up to a neighbourhood of the optimal value. Moreover, our analysis sheds light on the influence of the various hyperparameters on the convergence of the Adam optimizer. Numerical tests are conducted to assess the convergence speed and accuracy achieved by Adam when varying the configuration of the hyperparameters during the training of a multinomial logistic regression model for image classification.
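
For readers unfamiliar with the setting, the following is a minimal sketch and not the paper's method or experimental setup. It assumes the standard form of the Polyak-Łojasiewicz inequality, $\frac{1}{2}\|\nabla f(x)\|^2 \ge \mu\,(f(x) - f^*)$ for some $\mu > 0$, and the standard Adam update with bias correction. The test function (a diagonal quadratic, which satisfies the inequality with $\mu = 1$ and $f^* = 0$), the starting point, and the hyperparameter names and values `lr`, `beta1`, `beta2`, `eps` are all illustrative assumptions.

```python
import numpy as np

# Hypothetical test problem: f(x) = 0.5 * sum(d_i * x_i^2), minimum value f* = 0.
# It is strongly convex, hence satisfies the PL inequality with mu = min(d) = 1.
d = np.array([1.0, 10.0])

def f(x):
    return 0.5 * np.sum(d * x * x)

def grad_f(x):
    return d * x

def adam(grad, x0, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Standard Adam iteration; returns the list of all iterates."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)  # first-moment (mean) estimate
    v = np.zeros_like(x)  # second-moment (uncentered variance) estimate
    iterates = [x.copy()]
    for k in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** k)  # bias correction
        v_hat = v / (1.0 - beta2 ** k)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
        iterates.append(x.copy())
    return iterates

# Objective values along the run; since f* = 0 these are also the optimality gaps.
iterates = adam(grad_f, x0=[1.0, 1.0])
values = [f(x) for x in iterates]
print(f"initial gap: {values[0]:.3e}, final gap: {values[-1]:.3e}")
```

Plotting `values` on a logarithmic scale for different choices of `lr`, `beta1`, and `beta2` is one simple way to observe how the hyperparameters affect both the decay rate and the level at which the gap stalls on this toy problem.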
Year: 2025
ISBN: 9783031861680
ISBN: 9783031861697

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1329377