Counterfactual-Based Feature Importance for Explainable Regression of Manufacturing Production Quality Measure
Alfeo A. L.;Cimino M. G. C. A.
2024-01-01
Abstract
Machine learning (ML) methods need to explain their reasoning so that professionals can validate and trust their predictions and use them in real-world decision-making processes. Explainable artificial intelligence (XAI) methods based on feature importance can serve this purpose, although they can be very computationally expensive. Moreover, it can be challenging to determine whether an XAI technique introduces bias into the explanation (e.g., by overestimating or underestimating feature importance) in the absence of a reference feature importance measure, or at least of domain knowledge from which to derive an expected importance level for each feature. We address both issues by (i) employing a counterfactual-based strategy, i.e., deriving a measure of feature importance by checking whether minor changes in one feature's values significantly affect the ML model's regression outcome, and (ii) employing both synthetic and real-world industrial data coupled with the expected degree of importance for each feature. Our experimental results show that the proposed approach (BoCSoRr) is more reliable and far less computationally expensive than DiCE, a well-known counterfactual-based XAI approach able to provide a measure of feature importance.
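The general idea behind point (i) can be illustrated with a minimal sketch. It is not the BoCSoRr procedure, which the abstract does not detail. It only shows one plausible way to score a feature by checking how often a small change in that feature alone shifts a regressor's output noticeably. The function name `perturbation_importance`, the step size `rel_step`, the threshold `rel_threshold`, and the toy model are all assumptions made for illustration.

```python
import numpy as np


def perturbation_importance(predict, X, rel_step=0.2, rel_threshold=0.1):
    """Score each feature by the fraction of samples whose prediction moves
    by more than a tolerance when that feature alone is nudged.

    Hypothetical helper: the step and tolerance definitions are assumptions,
    not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    base = predict(X)
    # A prediction shift counts as significant if it exceeds a fraction
    # of the spread of the baseline predictions.
    tol = rel_threshold * (np.std(base) + 1e-12)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        # Small change, proportional to the spread of feature j.
        step = rel_step * np.std(X[:, j])
        changed = np.zeros(X.shape[0], dtype=bool)
        for sign in (-1.0, 1.0):
            X_cf = X.copy()
            X_cf[:, j] += sign * step  # perturb only feature j
            changed |= np.abs(predict(X_cf) - base) > tol
        scores[j] = changed.mean()
    return scores


# Toy regressor (assumption): feature 0 dominates, feature 1 is weak,
# feature 2 is ignored.
def toy_model(X):
    return 5.0 * X[:, 0] + 0.1 * X[:, 1]


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
scores = perturbation_importance(toy_model, X)
print(scores)
```

Because the toy model's expected importances are known in advance, the scores can be checked against them, which mirrors in spirit the paper's use of data coupled with an expected degree of importance for each feature.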