
Sequential Continual Pre-Training for Neural Machine Translation

Niko Dalla Noce; Michele Resta; Davide Bacciu
2024-01-01

Abstract

We explore continual pre-training for Neural Machine Translation within a continual learning framework. We introduce a setting where new languages are gradually added to pre-trained models across multiple training experiences. These pre-trained models are subsequently fine-tuned on downstream translation tasks. We compare mBART and mT5 pre-training objectives using four European languages. Our findings demonstrate that sequentially adding languages during pre-training effectively mitigates catastrophic forgetting and minimally impacts downstream task performance.
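
To make the setting concrete, the sketch below illustrates the training protocol the abstract describes: one denoising pre-training experience per newly added language, followed by fine-tuning on a downstream translation task. The data loaders, the denoising batch format, and the Hugging Face-style `model(input_ids=..., labels=...).loss` interface are assumptions made for illustration, not the authors' released code.

```python
# Illustrative sketch only: loaders and the seq2seq model interface are
# assumptions, not the paper's implementation.
import torch


def pretraining_experience(model, optimizer, denoising_loader, device="cpu"):
    """One continual pre-training experience on a newly added language.

    Each batch pairs a corrupted source (e.g. produced by the mBART or mT5
    denoising objective) with the original text as reconstruction target.
    """
    model.train()
    for noisy_ids, target_ids in denoising_loader:
        noisy_ids, target_ids = noisy_ids.to(device), target_ids.to(device)
        loss = model(input_ids=noisy_ids, labels=target_ids).loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model


def sequential_continual_pretraining(model, optimizer, language_streams):
    """Add languages one at a time: each experience continues pre-training
    on the next language's monolingual data without revisiting earlier ones."""
    for lang, loader in language_streams:  # e.g. [("en", ...), ("de", ...), ...]
        model = pretraining_experience(model, optimizer, loader)
    return model


def finetune_on_translation(model, optimizer, parallel_loader, device="cpu"):
    """Downstream fine-tuning on a translation task after pre-training."""
    model.train()
    for src_ids, tgt_ids in parallel_loader:
        src_ids, tgt_ids = src_ids.to(device), tgt_ids.to(device)
        loss = model(input_ids=src_ids, labels=tgt_ids).loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```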
Year: 2024
ISBN: 978-2-87587-090-2
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this record: https://hdl.handle.net/11568/1291487
