Abstract
As AI-based clinical decision support (AI-CDS) is introduced into more and more
aspects of healthcare services, HCI research plays an increasingly important
role in designing for complementarity between AI and clinicians. However,
current evaluations of AI-CDS often fail to capture when AI is and is not
useful to clinicians. This position paper reflects on our work and influential
AI-CDS literature to advocate for moving beyond evaluation metrics like Trust,
Reliance, Acceptance, and Performance on the AI's task (what we term the "trap"
of human-AI collaboration). Although these metrics can be meaningful in some
simple scenarios, we argue that optimizing for them ignores important ways that
AI falls short of clinical benefit, as well as ways that clinicians
successfully use AI. As the fields of HCI and AI in healthcare develop new ways
to design and evaluate CDS tools, we call on the community to prioritize
ecologically valid, domain-appropriate study setups that measure the emergent
forms of value that AI can bring to healthcare professionals.
Citation
ID: 283246
Ref Key: perer2025overrelying