The value of standards for health datasets in artificial intelligence-based applications.

Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the body of existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in the literature, and experts generally favored the development of a robust set of guidelines, but there were mixed views about how these could be implemented practically. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).

Description

Acknowledgements: This project is funded by the NHS AI Lab at the NHS Transformation Directorate and The Health Foundation and managed by the National Institute for Health and Care Research (AI_HI200014). The views expressed in this publication are those of the author(s) and not necessarily those of the NHS Transformation Directorate, The Health Foundation or the National Institute for Health and Care Research. D.T. and F.M. are funded by the National Pathology Imaging Co-operative, NPIC (project no. 104687), supported by a £50 million investment from the Data to Early Diagnosis and Precision Medicine strand of the government’s Industrial Strategy Challenge Fund, managed and delivered by UK Research and Innovation (UKRI).

Keywords

Humans, Artificial Intelligence, Consensus, Delivery of Health Care, Systematic Reviews as Topic

Journal Title

Nat Med

Journal ISSN

1078-8956
1546-170X

Volume Title

29

Publisher

Springer Science and Business Media LLC

Publisher DOI

https://doi.org/10.1038/s41591-023-02608-w

Rights

Attribution 4.0 International

Collections

Jisc Publications Router