Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence.
Authors
Bai, Xiang
Wang, Hanchen
Ma, Liya
Xu, Yongchao
Gan, Jiefeng
Fan, Ziwei
Yang, Fan
Ma, Ke
Yang, Jiehua
Bai, Song
Shu, Chang
Zou, Xinyu
Huang, Renhao
Zhang, Changzheng
Liu, Xiaowu
Tu, Dandan
Xu, Chuou
Zhang, Wenqing
Wang, Xi
Chen, Anguo
Zeng, Yu
Yang, Dehua
Wang, Ming-Wei
Holalkere, Nagaraj
Halin, Neil J
Kamel, Ihab R
Wu, Jia
Peng, Xuehua
Wang, Xiang
Shao, Jianbo
Mongkolwat, Pattanasak
Zhang, Jianjun
Liu, Weiyang
Teng, Zhongzhao
Beer, Lucian
Sanchez, Lorena Escudero
Rubin, Daniel
Zheng, Chuangsheng
Wang, Jianming
Li, Zhen
Schönlieb, Carola-Bibiane
Xia, Tian
Publication Date
2021-11-18
Journal Title
Nature Machine Intelligence
ISSN
2522-5839
Publisher
Springer Science and Business Media LLC
Volume
3
Issue
12
Pages
1081-1089
Language
en
Type
Article
This Version
VoR
Citation
Bai, X., Wang, H., Ma, L., Xu, Y., Gan, J., Fan, Z., Yang, F., et al. (2021). Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence. Nature Machine Intelligence, 3 (12), 1081-1089. https://doi.org/10.1038/s42256-021-00421-z
Abstract
Artificial intelligence (AI) provides a promising alternative for streamlining COVID-19 diagnoses. However, concerns surrounding security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalised model in clinical practice. To address this, we launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), where the AI model can be trained in a distributed manner and executed independently at each host institution under a federated learning (FL) framework without data sharing. Here we show that our FL model outperformed all the local models by a large margin (test sensitivity/specificity in China: 0.973/0.951, in the UK: 0.730/0.942), achieving performance comparable to that of a panel of professional radiologists. We further evaluated the model on hold-out data (collected from another two hospitals that did not take part in the FL) and heterogeneous data (acquired with contrast materials), provided visual explanations for decisions made by the model, and analysed the trade-offs between model performance and communication costs in the federated training process. Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK. Collectively, our work advances the prospects of utilising federated learning for privacy-preserving AI in digital health.
Sponsorship
Engineering and Physical Sciences Research Council (EP/N014588/1)
European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (777826)
EPSRC (EP/S026045/1)
EPSRC (EP/T017961/1)
EPSRC (EP/T003553/1)
Leverhulme Trust (RC-2015-067)
EPSRC (EP/V025279/1)
Identifiers
s42256-021-00421-z, 421
External DOI: https://doi.org/10.1038/s42256-021-00421-z
This record's URL: https://www.repository.cam.ac.uk/handle/1810/335944
Rights
Licence:
http://creativecommons.org/licenses/by/4.0/