Measures for explainable AI: Explanation goodness, user satisfaction, mental models, curiosity, trust, and human-AI performance

Abstract

If a user is presented an AI system that portends to explain how it works, how do we know whether the explanation works and the user has achieved a pragmatic understanding of the AI? This question entails some key concepts of measurement such as explanation goodness and trust. We present methods for enabling developers and researchers to: (1) Assess the a priori goodness of explanations, (2) Assess users' satisfaction with explanations, (3) Reveal user's mental model of an AI system, (4) Assess user's curiosity or need for explanations, (5) Assess whether the user's trust and reliance on the AI are appropriate, and finally, (6) Assess how the human-XAI work system performs. The methods we present derive from our integration of extensive research literatures and our own psychometric evaluations. We point to the previous research that led to the measurement scales which we aggregated and tailored specifically for the XAI context. Scales are presented in sufficient detail to enable their use by XAI researchers. For Mental Model assessment and Work System Performance, XAI researchers have choices. We point to a number of methods, expressed in terms of methods' strengths and weaknesses, and pertinent measurement issues.

Document Details

Document Type
Pub Defense Publication
Publication Date
Feb 06, 2023
Source ID
10.3389/fcomp.2023.1096257

Entities

People

  • Gary A. Klein
  • Jordan Litman
  • Robert R. Hoffman
  • Shane T Mueller

Organizations

  • Defense Advanced Research Projects Agency

Tags

Readers

  • Neural Network Machine Learning.
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance.
  • Theoretical Analysis.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference