This Sounds Like That: Explainable Audio Classification via Prototypical Parts
Fedele, Andrea; Guidotti, Riccardo; Pedreschi, Dino
2025-01-01
Abstract
The demand for understanding machine learning models has led to the development of interpretable-by-design models that provide both outcomes and explanations. In this paper, we extend the concept of Prototypical Part Networks to the audio domain with SonicProtoPNet. This model enables “this sounds like that” reasoning for audio classification, where a test audio instance is classified based on the prototypical parts that most resemble specific regions of specific training instances. Quantitative results on genre classification, environmental sound classification, and musical instrument recognition tasks demonstrate satisfactory performance using the Log-Mel transformation of the audio input signal, further supported by backbone pre-training on image-input data. Furthermore, we introduce a high-quality back-soundification method for the learned sonic prototypes, facilitating intuitive interpretation of classification decisions through auditory inspection.


