Solving the scalarization issues of Advantage-based Reinforcement Learning algorithms
Galatolo F. A.; Cimino M. G. C. A.; Vaglini G.
2021-01-01
Abstract
This research investigates issues that arise from scalarizing the multi-objective optimization problem in the Advantage Actor–Critic (A2C) reinforcement learning algorithm. The paper shows how a naive scalarization can lead to overlapping gradients, and discusses the possibility that the entropy regularization term is a source of uncontrolled noise. To address these issues, a technique that avoids gradient overlapping while keeping the same loss formulation is proposed. A method that avoids the uncontrolled noise by sampling actions from distributions with a desired minimum entropy is also investigated. Pilot experiments show how the proposed method speeds up training. The proposed approach can be applied to any advantage-based reinforcement learning algorithm.

File | Type | License | Access | Size | Format |
---|---|---|---|---|---|
2004.04120.pdf | Pre-print | Creative Commons | Open access | 1.85 MB | Adobe PDF |
1-s2.0-S004579062100121X-main.pdf | Publisher's final version | Not public (private/restricted access) | Authorized users only | 3.53 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.