International evaluation of an AI system for breast cancer screening.
Authors
McKinney, Scott Mayer
Sieniek, Marcin
Godbole, Varun
Godwin, Jonathan
Antropova, Natasha
Ashrafian, Hutan
Back, Trevor
Chesus, Mary
Corrado, Greg S
Darzi, Ara
Etemadi, Mozziyar
Garcia-Vicente, Florencia
Gilbert, Fiona J
Halling-Brown, Mark
Hassabis, Demis
Jansen, Sunny
Karthikesalingam, Alan
Kelly, Christopher J
King, Dominic
Ledsam, Joseph R
Melnick, David
Mostofi, Hormuz
Peng, Lily
Reicher, Joshua Jay
Romera-Paredes, Bernardino
Sidebottom, Richard
Suleyman, Mustafa
Tse, Daniel
Young, Kenneth C
De Fauw, Jeffrey
Shetty, Shravya
Publication Date
2020-01
Journal Title
Nature
ISSN
0028-0836
Publisher
Springer Science and Business Media LLC
Volume
577
Issue
7788
Pages
89-94
Language
eng
Type
Article
This Version
AM
Physical Medium
Print-Electronic
Citation
McKinney, S. M., Sieniek, M., Godbole, V., Godwin, J., Antropova, N., Ashrafian, H., Back, T., et al. (2020). International evaluation of an AI system for breast cancer screening. Nature, 577 (7788), 89-94. https://doi.org/10.1038/s41586-019-1799-6
Abstract
Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful [1]. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives [2]. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening.
Keywords
Artificial Intelligence, Breast Neoplasms, Early Detection of Cancer, Female, Humans, Mammography, Reproducibility of Results, United Kingdom, United States
Sponsorship
Professor Fiona Gilbert receives funding from the National Institute for Health Research (Senior Investigator award).
Funder references
Department of Health (via National Institute for Health Research (NIHR)) (NF-SI-0515-10067)
NETSCC (None)
Identifiers
External DOI: https://doi.org/10.1038/s41586-019-1799-6
This record's URL: https://www.repository.cam.ac.uk/handle/1810/299195
Rights
All rights reserved
Licence:
http://www.rioxx.net/licenses/all-rights-reserved