The explainability-trust hypothesis: when intuition leads us to oversimplification

Trustworthiness is widely cited as a key property for the effective deployment of AI (High-Level Expert Group on AI, 2019; Leslie, 2019). However, it is not obvious how to achieve it. The literature (Creel, 2020; Páez, 2019; Ribeiro et al., 2016) often assumes that explanations lead to trust. This assumption has been called the Explainability-Trust Hypothesis (ET) (Kästner et al., 2021), and it is commonly used to argue for Explainable AI (XAI). However, the link between trust and explanations is complex, and I argue that taking ET for granted is problematic.

The main goal of the paper is to clarify how strong ET is and to point out its limitations. The principal limitation that I highlight is that, even though explanations may lead to trust, they do not guarantee it, contrary to what the literature suggests. By making ET explicit, I aim to draw attention to problems that are usually overlooked and that need to be taken into account when using explanations as a way to achieve trust. Kästner et al. (2021) have already suggested that the nature of ET is epistemological rather than empirical. I elaborate on this point in order to uncover how a certain mental state, namely trust, is reached. Trust is often seen as an attitude (Jones, 1996; Nguyen, 2021). I hold that it can be considered a mental state, insofar as it is the belief that the trustee will perform as expected. According to ET, trust can be reached through explanations. That is, explaining to the truster how a system works can change their beliefs, subsequently affecting their mental states. The problem is that ET aims to establish a necessary connection, which conflicts with this epistemological nature. Different trusters have different beliefs as baselines; thus, the ways to reach a given mental state will differ, and different trusters require different explanations. Hence, a rule that purports to guarantee how an attitude such as trust is achieved may not hold in every case.

The critique of ET rests on the fact that it tends to oversimplify the matter. The connection between trust and explanations is contingent and dependent on the mental state of the truster. This is key for XAI, since acknowledging ET's limitations helps to shed light on which kinds of explanations lead to trust and in which contexts. This acknowledgement also makes clearer which kind of (appropriate) trust XAI aims at: the cognitive attitude towards an agent on whom we are confident in relying, based on rational proof. Spelling out which kinds of explanations lead to which kinds of trust helps us to understand to what extent it is fair to establish a connection between the two concepts.