On the Utility of Prediction Sets in Human-AI Teams
Publication Date
2022-05-03
Journal Title
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Conference Name
Thirty-First International Joint Conference on Artificial Intelligence {IJCAI-22}
Publisher
International Joint Conferences on Artificial Intelligence Organization
Type
Conference Object
This Version
AM
Citation
Babbar, V., Bhatt, U., & Weller, A. (2022). On the Utility of Prediction Sets in Human-AI Teams. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence. https://doi.org/10.24963/ijcai.2022/341
Abstract
Research on human-AI teams usually provides experts with a single label,
which ignores the uncertainty in a model's recommendation. Conformal prediction
(CP) is a well-established line of research that focuses on building a
theoretically grounded, calibrated prediction set, which may contain multiple
labels. We explore how such prediction sets impact expert decision-making in
human-AI teams. Our evaluation with human subjects finds that set-valued
predictions positively impact experts. However, we observe that the prediction
sets provided by CP can be very large, which leads to unhelpful AI assistants.
To mitigate this, we introduce D-CP, a method that performs CP on some examples
and defers the rest to experts. We prove that D-CP can reduce the prediction set size of
non-deferred examples. We show how D-CP performs in quantitative and in human
subject experiments ($n=120$). Our results suggest that CP prediction sets
improve human-AI team performance over showing the top-1 prediction alone, and
that experts find D-CP prediction sets more useful than CP prediction sets.
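
For readers unfamiliar with CP, the following is a minimal sketch of how a calibrated prediction set can be built with generic split conformal prediction. It is not the authors' implementation and does not cover the deferral step of D-CP. The function names (`calibrate_threshold`, `prediction_sets`), the score (one minus the probability assigned to the true label), and the miscoverage level `alpha` are all assumptions made for illustration.

```python
import math
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Return a score threshold from held-out calibration data (hypothetical helper)."""
    n = len(cal_labels)
    # Nonconformity score: one minus the probability given to the true label.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    # If the rank exceeds n, only the full label set keeps the guarantee.
    return float("inf") if k > n else float(scores[k - 1])

def prediction_sets(test_probs, threshold):
    """Return, for each test example, every label whose score is within the threshold."""
    return [np.flatnonzero(1.0 - p <= threshold).tolist() for p in test_probs]
```

Under this sketch, an uncertain model yields larger sets, which illustrates the set-size concern the abstract raises about plain CP.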
Sponsorship
The Alan Turing Institute
Leverhulme Trust via CFI
DeepMind
Mozilla Foundation
Funder references
Leverhulme Trust (RC-2015-067)
EPSRC (EP/V025279/1)
Alan Turing Institute (TUR-000346)
Identifiers
External DOI: https://doi.org/10.24963/ijcai.2022/341
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336716