The ChatGPT and Education Tweets Dataset
Barandoni, Simone (first author); Chiarello, Filippo; Giordano, Vito; Fantoni, Gualtiero
2025-01-01
Abstract
The development of large language models such as GPT has already revolutionized the fields of Artificial Intelligence and Natural Language Processing by enabling human-like conversation generation. The increasing diffusion of ChatGPT is expected to influence many more areas, and society as a whole. Education is one of the domains likely to be affected most by this revolution. Analyzing the general public's opinions on emerging technologies such as ChatGPT is therefore crucial for understanding their potential ethical implications and applications within the educational landscape, and it can serve various purposes, including performance evaluation and the identification of needs and expectations. This paper presents a novel dataset comprising 236 thousand tweets that capture public discourse surrounding ChatGPT and education, along with enriched dimensions extracted from the original texts. By leveraging data enrichment techniques, we make the dataset accessible for analysis by researchers and practitioners who may not have expertise in programming and Natural Language Processing. The dataset is a valuable resource for exploratory data analysis of Twitter users' perceptions of the potential impact of ChatGPT on educational processes.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.