CINECA IRIS Institutional Research Information System

The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of AI in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method providing faithful explanations of the decision made by a black-box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black-box.

Factual and Counterfactual Explanations for Black Box Decision Making

Guidotti R.^Primo;Monreale A.;Giannotti F.;Pedreschi D.;Ruggieri S.;Turini F.

2019-01-01

Abstract

The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of AI in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method providing faithful explanations of the decision made by a black-box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black-box.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2019
			
	Codice DOI
	
				https://dx.doi.org/10.1109/MIS.2019.2957223
			
	Tutti gli autori
	
						Guidotti, R.; Monreale, A.; Giannotti, F.; Pedreschi, D.; Ruggieri, S.; Turini, F.

File in questo prodotto:

File	Dimensione	Formato
IS_2019___LORE___CR.pdf accesso aperto Tipologia: Documento in Pre-print Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 1.49 MB Formato Adobe PDF Visualizza/Apri	1.49 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1023240

Citazioni

ND

249

165

social impact