A considerable portion of the literature on AI ethics is dedicated to developing an account of what it would take to make AI systems *trustworthy*. Here, we argue that trustworthy AI is not an appropriate goal around which to organize institutional policy. The argument takes the form of a dilemma, according to which the concept of trustworthiness can be interpreted in either a strong or a weak sense. On the strong sense, trustworthiness requires elements of human sociality, such as pro-social motivation or the capacity to sustain social feedback loops (when Emma trusts Julia, it becomes more likely that Julia will trust Emma). It is highly unlikely that AI will exhibit such properties in the near future. As a result, developing AI that is trustworthy in this strong sense is unrealistic.
The weak reading of trustworthiness is inspired by the work of C. Thi Nguyen, who argues that trust is not essentially tied to human sociality, and that even simple artifacts, such as a rock climbing rope, can become trustworthy under the right conditions. For Nguyen, to trust an artifact is to take an unquestioning attitude toward it. Once you have taken an unquestioning attitude toward an artifact, you can get on with whatever you are doing, without having to expend cognitive energy on ensuring that it is working as you expect. With respect to AI, this is a dangerous attitude. AI systems are typically epistemically opaque, which entails, among other things, that we cannot foresee the nature of their mistakes. Under these circumstances, a call to actively cultivate an unquestioning attitude is irresponsible.
The philosophical literature on trustworthiness is full of divergent views, but almost everyone agrees that reliability is an essential component. Moreover, reliability is a concept that applies unproblematically to epistemic tools such as measurement devices. We therefore explore the proposal that reliable AI, rather than trustworthy AI, should be our goal. Although reliability is an appropriate focal point for norms concerning many epistemic technologies, including computer simulation, it fails in the case of AI-generated decision making, especially where the decisions have significant social consequences. There are two reasons for this. First, many AI-generated decisions, such as the decision whether to grant a bank loan, are not truth-apt, and therefore not susceptible to standard conceptions of reliability. Second, as the literature on adversarial examples shows, in those cases where the AI-generated decisions are truth-apt, the mistakes can be egregious. As Paul Humphreys has recently argued, in such cases, reliability must be balanced against the extremity of the error. In the domain of AI, therefore, reliability cannot do the justificatory work commonly assigned to trustworthiness.
To avoid these difficulties, we propose that the appropriate unit of analysis for the ethical use of AI technology is not the algorithm itself but the larger social system that deploys it, of which the algorithm is only one part. We cannot have trustworthy AI, but we can have trustworthy institutions that deploy AI responsibly.