BAMBI Goes to School: Evaluating Italian BabyLMs with Invalsi-ITA

Luca Capone;Alessandro Lenci
2025-01-01

Abstract

This paper explores the impact of ecologically and cognitively plausible data on the training of language models. It builds on prior work [1, 2] integrating child-directed speech, curriculum learning, and instruction tuning to train Italian BabyLMs. We compare our BabyLMs, trained on fewer than 100M words using various techniques, with native Italian Large Language Models on the Invalsi-ITA benchmark [3], which is designed to evaluate Italian students on text comprehension and linguistic abilities. The goal is to assess whether cognitively motivated training approaches (curriculum learning based on child-directed speech and child-friendly data), which are crucial for meaningful comparison between human learners and computational systems [4], yield greater efficiency than standard methods.
Files in this record:
File: 2025 Capone et al. - Bambi goes to school. Evaluating italian babylms with invalsi-ita.pdf
Access: open access
Type: Final published version
License: Creative Commons
Size: 933.41 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1327955