
Enhancing Early Detection of Oesophageal Squamous Cell Carcinoma Through Shallow Whole-Genome Sequencing and Non-Endoscopic Sponge Sampling


Abstract

The incidence of oesophageal squamous cell carcinoma (OSCC) remains high, accounting for approximately 90% of the nearly half-million annual cases of oesophageal cancer globally. Although effective, conventional upper endoscopy is invasive and costly, limiting its widespread use for screening high-risk populations and surveillance of at-risk individuals, including those with a history of head and neck cancer. In this thesis, I explored the potential of shallow whole-genome sequencing (sWGS) to detect genome-wide copy number alterations (CNAs) in cells collected using a non-endoscopic sponge pan-oesophageal sampling device (Cytosponge) for the early detection of OSCC and its precursor lesions, oesophageal squamous cell dysplasia (OSD).

To achieve this, I used previously unexamined samples from two prospective studies: the Early Detection and Surveillance of Oesophageal Cancer with Cytosponge and Nucleic Acid Biomarkers (EDEN) study and the Oesophageal Squamous cell CARcinoma (OSCAR) project. Across both studies, I analysed 242 oesophageal specimens collected using the Cytosponge device from 178 participants, comprising specimens from patients with early-stage OSCC (n = 71), high-risk individuals (n = 75), and healthy controls with dyspeptic symptoms (n = 96). These samples underwent 1.5-2x sWGS, alongside standard pathological assessment including haematoxylin & eosin (H&E) staining for atypia detection and p53 immunohistochemistry (IHC).

For the first time, I demonstrated the feasibility of detecting CNAs indicative of OSD and early cancer from non-micro-dissected, pan-oesophageal Cytosponge samples. My novel genomic approach integrated chromosome arm-level CNAs with a new metric, the genome-wide instability score (GWIS), which quantifies genomic instability across the entire genome. Using a generalised linear model with elastic net regularisation, I identified specific alterations in chromosome arms 2q, 3q, 9p, and 11q, along with the GWIS, as significant predictors of OSD and early OSCC.

A logistic regression model integrating these CNA-based biomarkers achieved a cross-validated area under the curve (AUC) of 0.925 (95% CI: 0.913-0.936), with sensitivity of 0.896 (95% CI: 0.877-0.915), specificity of 0.887 (95% CI: 0.864-0.910), and accuracy of 0.888 (95% CI: 0.877-0.898). This outperformed standard pathological biomarkers, which had a cross-validated AUC of 0.867 (95% CI: 0.851-0.884), sensitivity of 0.851 (95% CI: 0.833-0.869), specificity of 0.831 (95% CI: 0.801-0.861), and accuracy of 0.832 (95% CI: 0.819-0.846). Combining the CNA-based markers with pathology data further improved diagnostic performance, with a cross-validated AUC of 0.953 (95% CI: 0.943-0.963), sensitivity of 0.923 (95% CI: 0.908-0.939), specificity of 0.934 (95% CI: 0.917-0.952), and accuracy of 0.906 (95% CI: 0.895-0.917).
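The metrics reported above (AUC, sensitivity, specificity, accuracy) can be illustrated with a minimal Python sketch. It is not the thesis pipeline: the labels and scores below are hypothetical toy values, the 0.5 decision threshold is an assumption, and the function names are invented for illustration. The AUC is computed as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as half.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count true/false positives and negatives at a given score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Proportion of true cases that the model flags as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of true controls that the model flags as negative."""
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    """Proportion of all samples that are classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, y_score):
    """Rank-based AUC: share of (case, control) pairs ranked correctly."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (NOT study data): 1 = dysplasia/cancer, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold=0.5)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("accuracy:", accuracy(tp, fp, tn, fn))
print("AUC:", auc(y_true, y_score))
```

In a cross-validated setting, such metrics would be computed on held-out folds and then summarised; the confidence intervals quoted above come from the thesis and are not reproduced by this sketch.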

To elucidate the biological significance and cellular origin of the genomic signals detected in the heterogeneous pan-oesophageal Cytosponge samples, I evaluated the identified genomic markers in micro-dissected regions of archival, paraffin-embedded endoscopic submucosal dissection (ESD) specimens. These specimens provided discrete areas of normal, dysplastic, and cancerous tissue, allowing marker performance to be assessed precisely across different pathological states. When applied to this independent ESD dataset, the Cytosponge CNA-based model achieved an AUC of 0.879, with an accuracy of 0.855, sensitivity of 0.823, and specificity of 0.896. When low-grade dysplasia (LGD) samples, which are known to be more challenging to categorise, were excluded, performance improved to an AUC of 0.902, with an accuracy of 0.892, sensitivity of 0.889, and specificity of 0.896.

The key findings of this study are that generalised instability, quantified by the GWIS, and specific CNAs in chromosome arms 2q, 3q, 9p, and 11q are strong predictors of OSCC and its precursor lesions. These findings align closely with the established literature on OSCC progression. The GWIS captured increasing genomic instability from normal/benign conditions to dysplastic and cancerous states, consistent with the recognised role of genomic instability as a hallmark of OSCC progression. Furthermore, the identified chromosomal regions harbour biologically significant genes known to play crucial roles in OSCC development, including CDKN2A (9p), NFE2L2 (2q), TSLC1 (11q), and SOX2 and PIK3CA (3q).

This thesis presents the first comprehensive evaluation of a genomic model for OSCC detection, integrating chromosome arm-level CNAs and a novel genomic instability metric—the GWIS—using the minimally invasive Cytosponge method. The integration of these genomic features with traditional biomarkers significantly enhanced diagnostic accuracy, marking a substantial advancement in early OSCC detection. Biological investigation using micro-dissected ESD specimens further supported the robustness of the identified genomic markers, closely aligning with the established literature on OSCC tumourigenesis. The minimally invasive nature of the Cytosponge method offers a promising tool for screening in diverse clinical settings, with the potential to improve compliance and effectiveness, particularly in high-risk populations. If validated in independent test sets, this approach could substantially reduce the public health burden of OSCC through earlier interventions and more efficient screening strategies.

Description

Date

2024-09-24

Advisors

Fitzgerald, Rebecca

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

The Bill & Melinda Gates Foundation