Updating knowledge in Large Language Models: an Empirical Evaluation
Antonio Carta; Lucia C. Passaro
2024-01-01
Abstract
Natural Language Processing (NLP) has witnessed a paradigm shift with Large Language Models (LLMs), yet the static knowledge acquired during pre-training can become obsolete. This study focuses on the dynamic relationship between LLMs and evolving knowledge, using GPT-2 as a case study. Leveraging an existing framework, we update models with monthly Wikipedia dumps and Wikidata probes, addressing the stability-plasticity trade-off. We introduce a novel synthetic data generation method for experimental control and present SMARTREVIEW, a state-of-the-art continual learning method. This work advances understanding and methodologies for tackling knowledge obsolescence in evolving language models.

File: EAIS58494.2024.10570019.pdf
Description: Final editorial version
Type: Final editorial version
License: NON-PUBLIC - private/restricted access
Size: 736.13 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.