Evaluating the Privacy Exposure of Interpretable Global and Local Explainers

Naretto, F.; Monreale, A.
2025-01-01

Abstract

In recent years, the abundance of data has significantly boosted the performance of Machine Learning models, embedding them in many aspects of daily life. However, the rise of powerful Artificial Intelligence tools has introduced ethical and legal complexities. This paper proposes a computational framework to analyze the ethical and legal dimensions of Machine Learning models, focusing specifically on privacy concerns and interpretability. Indeed, the research community has recently proposed privacy attacks able to reveal whether a record was part of a black-box's training set, or to infer variable values, simply by accessing and querying a Machine Learning model. These attacks highlight privacy vulnerabilities and show that the GDPR may be violated by making data or Machine Learning models accessible. At the same time, the complexity of these models, often labelled “black boxes”, has made the development of explanation methods indispensable to enhance trust and to facilitate their acceptance and adoption in high-stakes scenarios. Our study highlights the trade-off between interpretability and privacy protection. This paper introduces REVEAL, a framework to evaluate the privacy exposure of black-box models and of their surrogate-based explainers, both local and global. Our methodology is adaptable and applicable across diverse black-box models and various privacy attack scenarios. Through an in-depth analysis, we show that the interpretability layer introduced by explanation models might jeopardize the privacy of individuals in the black-box's training data, particularly under powerful privacy attacks that require minimal knowledge yet cause significant privacy breaches.
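
To make the threat model concrete, the sketch below is illustrative only: it is not the REVEAL framework nor the paper's experimental setup. It trains a global surrogate decision tree to mimic a black-box model (a standard surrogate-explainer construction) and then runs a simple confidence-threshold membership-inference attack against both the black-box and the surrogate. Every dataset, model, and threshold choice here is an assumption made for illustration.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in data: in the audited setting this would be the
    # (private) training set of the black-box.
    X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
    X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5,
                                                random_state=0)

    # Black-box model: opaque to the attacker, who can only query it.
    black_box = RandomForestClassifier(n_estimators=100, random_state=0)
    black_box.fit(X_in, y_in)

    # Global surrogate explainer: an interpretable tree trained to mimic
    # the black-box by fitting the black-box's own predicted labels.
    surrogate = DecisionTreeClassifier(max_depth=5, random_state=0)
    surrogate.fit(X_in, black_box.predict(X_in))

    def membership_advantage(model, members, non_members, thr=0.9):
        """Confidence-threshold membership inference: guess 'member' when
        the top-class probability exceeds thr; return TPR - FPR (the
        attack advantage). The threshold is an illustrative assumption;
        a fuller audit would sweep it."""
        tpr = (model.predict_proba(members).max(axis=1) > thr).mean()
        fpr = (model.predict_proba(non_members).max(axis=1) > thr).mean()
        return tpr - fpr

    print("attack advantage vs black-box:",
          membership_advantage(black_box, X_in, X_out))
    print("attack advantage vs surrogate:",
          membership_advantage(surrogate, X_in, X_out))

The attack exploits the gap between a model's confidence on training members and on unseen records; comparing the two advantages hints at the paper's point that a surrogate which faithfully mimics the black-box can inherit, and expose, part of that leakage.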
2025
Naretto, F.; Monreale, A.; Giannotti, F.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1309607