Classification of intestinal T-cell receptor repertoires using machine learning methods can identify patients with coeliac disease regardless of dietary gluten status.

Foers, Andrew D 
Shoukat, M Saad 
Welsh, Oliver E 
Donovan, Killian 
Petry, Russell 

In coeliac disease (CeD), immune-mediated small intestinal damage is precipitated by gluten, leading to variable symptoms and complications, occasionally including aggressive T-cell lymphoma. Diagnosis, based primarily on histopathological examination of duodenal biopsies, is confounded by poor concordance between pathologists and minimal histological abnormality if insufficient gluten is consumed. CeD pathogenesis involves both CD4+ T-cell-mediated gluten recognition and CD8+ and γδ T-cell-mediated inflammation, with a previous study demonstrating a permanent change in γδ T-cell populations in CeD. We leveraged this understanding and explored the diagnostic utility of bulk T-cell receptor (TCR) sequencing in assessing duodenal biopsies in CeD. Genomic DNA extracted from duodenal biopsies underwent sequencing for TCR-δ (TRD) (CeD, n = 11; non-CeD, n = 11) and TCR-γ (TRG) (CeD, n = 33; non-CeD, n = 21). We developed a novel machine learning-based analysis of the TCR repertoire, clustering samples by diagnosis. Leave-one-out cross-validation (LOOCV) was performed to validate the classification algorithm. Using the TRD repertoire, 100% (22/22) of duodenal biopsies were correctly classified, with a LOOCV accuracy of 91%. Using the TRG repertoire, 94.4% (51/54) of duodenal biopsies were correctly classified, with a LOOCV accuracy of 87%. Analysis of the duodenal biopsy TRG repertoire permitted accurate classification of biopsies from patients with CeD who had followed a strict gluten-free diet for at least 6 months and who would be misclassified by current tests. This result reflects permanent changes to the duodenal γδ TCR repertoire in CeD, even in the absence of gluten consumption. Our method could complement or replace histopathological diagnosis in CeD and might have particular clinical utility in the diagnostic testing of patients unable to tolerate dietary gluten, and for assessing duodenal biopsies with equivocal features.
This approach is generalisable to any TCR/BCR locus and any sequencing platform, with potential to predict diagnosis or prognosis in conditions mediated or modulated by the adaptive immune response. © 2020 The Authors. The Journal of Pathology published by John Wiley & Sons, Ltd. on behalf of The Pathological Society of Great Britain and Ireland.
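
For readers unfamiliar with leave-one-out cross-validation, the sketch below illustrates the general idea only: each sample is held out once, a classifier is fitted to the remaining samples, and the held-out sample is then predicted. It is not the authors' algorithm. The nearest-centroid classifier, the function names and the toy feature vectors are all assumptions made for illustration, and none of the numbers come from the study.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the label whose training centroid is closest (Euclidean).

    A stand-in classifier chosen for brevity; the study's own method differs.
    """
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda label: dist(centroids[label], x))


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-dimensional repertoire features (NOT data from the study).
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = ["non-CeD"] * 3 + ["CeD"] * 3

toy_accuracy = loocv_accuracy(toy_X, toy_y)
print(f"LOOCV accuracy on toy data: {toy_accuracy:.0%}")
```

Because every prediction is made for a sample the classifier has not seen, a LOOCV accuracy is typically lower than the accuracy obtained when all samples are classified together, which is consistent with the gap between the two figures reported for each locus above.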

T-cell receptor repertoire, T-lymphocyte, TRD, TRG, clustering, coeliac disease, duodenum, gluten, machine learning, Adult, Celiac Disease, Diet, Gluten-Free, Female, Humans, Intestine, Small, Machine Learning, Male, Middle Aged, Receptors, Antigen, T-Cell, gamma-delta
Journal Title
J Pathol
All rights reserved
Coeliac UK (ES01-14)
Medical Research Council (MC_PC_17185)