Abstract
With the advent of explainable artificial intelligence (XAI) to explain the outputs of black-box machine learning models, the question arises of how such explanations should be conceptualised. Specifically, assuming that XAI methods provide explanations, what types of explanations are these? Although this question is usually left implicit, informal discussion of XAI methods often suggests that XAI provides reason explanations that show a machine learning model’s reasons for its classifications. This suggestion has recently been explicitly defended by Zerilli and by Baum et al. This paper argues that this idea is mistaken. Neither of the two main families of views of reason explanations (the causal/dispositional family, on the one hand, and the interpretative family, on the other) can be applied to XAI. On the first type of view, it is unclear how the causal or dispositional states of black-box models could ground the normative attitudes characteristic of motivating reasons. On the second type of view, a reason explanation should show a decision in a positive light from the decision-maker’s perspective. This means that the provider of a reason explanation should be able to answer normative why-questions about the decision-maker accurately. It also requires the assumption that the decision-maker has recognisable concerns and beliefs. Black-box models, and the XAI methods designed to explain them, do not meet these conditions. The paper concludes that other types of explanations, such as causal explanations, are more appropriate for conceptualising XAI outputs, and that a reason-based conceptualisation risks creating false expectations and possibly over-reliance in users.
| Original language | English |
|---|---|
| Article number | 36 |
| Number of pages | 25 |
| Journal | Minds and Machines |
| Volume | 35 |
| Issue number | 3 |
| Publication status | Published - 2 Aug 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Keywords
- Explainable artificial intelligence
- Motivating reasons
- Reason explanations