A Descriptive Review of Image Datasets for Accessible Alternative Descriptions in STEM Domains
Cardia, Marco; Buzzi, Marina; Galesi, Giulio; Leporini, Barbara
2025-01-01
Abstract
STEM disciplines rely heavily on visual content such as charts, diagrams, and formulas. These figures are often inaccessible to blind or visually impaired users because they lack meaningful alternative text. While automated image captioning has progressed, existing datasets are largely oriented toward general images and overlook the structural and semantic complexity of STEM visual content. This paper presents a descriptive review of publicly available image datasets, evaluating their applicability for generating accessible descriptions of STEM images. Our analysis reveals major gaps: limited support for complex scientific content, shallow annotations, and little consideration of accessibility standards. We argue for the creation of a specialized dataset with rich, structured annotations aligned with accessibility goals. By identifying these gaps, this work supports the development of AI tools and datasets that enhance inclusive access to STEM content.


