Sequential Continual Pre-Training for Neural Machine Translation
Niko Dalla Noce; Michele Resta; Davide Bacciu
2024-01-01
Abstract
We explore continual pre-training for Neural Machine Translation within a continual learning framework. We introduce a setting where new languages are gradually added to pre-trained models across multiple training experiences. These pre-trained models are subsequently fine-tuned on downstream translation tasks. We compare mBART and mT5 pre-training objectives using four European languages. Our findings demonstrate that sequentially adding languages during pre-training effectively mitigates catastrophic forgetting and minimally impacts downstream task performance.
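To illustrate the setting described above, the following is a minimal sketch (not the authors' released code) of sequential continual pre-training: a multilingual denoising model is updated on one new language per training "experience" without revisiting earlier data, and each resulting checkpoint can then be fine-tuned on a downstream translation task. The base checkpoint, hyper-parameters, toy corpora, and the crude masking helper are illustrative assumptions, not values or components taken from the paper.

```python
import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

model_name = "facebook/mbart-large-cc25"  # assumed base checkpoint
tokenizer = MBartTokenizer.from_pretrained(model_name)
model = MBartForConditionalGeneration.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# One "experience" per newly added language; toy sentences stand in for
# the monolingual corpora used during denoising pre-training.
experiences = {
    "de_DE": ["Das ist ein Beispielsatz.", "Maschinelle Übersetzung ist nützlich."],
    "fr_XX": ["Ceci est une phrase d'exemple.", "La traduction automatique est utile."],
}

def mask_tokens(input_ids, mask_token_id, ratio=0.35):
    """Crude random token masking, a stand-in for mBART's span-noising function."""
    noisy = input_ids.clone()
    mask = torch.rand(noisy.shape) < ratio
    noisy[mask] = mask_token_id
    return noisy

model.train()
for lang, sentences in experiences.items():  # languages arrive sequentially
    tokenizer.src_lang = lang
    for sentence in sentences:
        batch = tokenizer(sentence, return_tensors="pt")
        labels = batch["input_ids"]
        noisy_inputs = mask_tokens(labels, tokenizer.mask_token_id)
        # Denoising objective: reconstruct the clean sentence from the noisy input.
        loss = model(input_ids=noisy_inputs,
                     attention_mask=batch["attention_mask"],
                     labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    # Checkpoint after each experience; each checkpoint would later be
    # fine-tuned on a parallel corpus to probe forgetting and downstream impact.
    model.save_pretrained(f"mbart-after-{lang}")
```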