Open access-enabled evaluation of epigenetic age acceleration in colorectal cancer and development of a classifier with diagnostic potential.


Widayati, Tyas Arum 
Schneider, Jadesada 
Panteleeva, Kseniia 
Chernysheva, Elizabeth 
Hrbkova, Natalie 


Aberrant DNA methylation (DNAm) is known to be associated with the aetiology of cancer, including colorectal cancer (CRC). In the past, the availability of open access data has been the main driver of innovative method development and research training. However, this is increasingly being eroded by the move to controlled access, particularly of medical data, including cancer DNAm data. To rejuvenate this valuable tradition, we leveraged DNAm data from 1,845 samples (535 CRC tumours, 522 normal colon tissues adjacent to tumours, 72 colorectal adenomas, and 716 normal colon tissues from healthy individuals) from 14 open access studies deposited in NCBI GEO and ArrayExpress. We calculated each sample's epigenetic age (EA) using eleven epigenetic clock models and derived the corresponding epigenetic age acceleration (EAA). For EA, we observed that most first- and second-generation epigenetic clocks reflect the chronological age in normal tissues adjacent to tumours and healthy individuals [e.g., Horvath (r = 0.77 and 0.79), Zhang elastic net (EN) (r = 0.70 and 0.73)] unlike the epigenetic mitotic clocks (EpiTOC, HypoClock, MiAge) (r < 0.3). For EAA, we used PhenoAge, Wu, and the above mitotic clocks and found them to have distinct distributions in different tissue types, particularly between normal colon tissues adjacent to tumours and cancerous tumours, as well as between normal colon tissues adjacent to tumours and normal colon tissue from healthy individuals. Finally, we harnessed these associations to develop a classifier using elastic net regression (with lasso and ridge regularisations) that predicts CRC diagnosis based on a patient's sex and EAAs calculated from histologically normal controls (i.e., normal colon tissues adjacent to tumours and normal colon tissue from healthy individuals). 
The classifier demonstrated good diagnostic potential with ROC-AUC = 0.886, which suggests that an EAA-based classifier trained on relevant data could become a tool to support diagnostic/prognostic decisions in CRC for clinical professionals. Our study also reemphasises the importance of open access clinical data for method development and training of young scientists. Obtaining the required approvals for controlled access data would not have been possible in the timeframe of this study.
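
To make the EA and EAA quantities above concrete, the following minimal sketch computes a Pearson correlation between chronological and epigenetic age, then derives age acceleration as the residual of a least-squares fit of epigenetic age on chronological age. The residual definition is one common convention and is an assumption here, since the abstract does not state how EAA was derived. The function names and the five toy samples are hypothetical and are not data from the study.

```python
# Illustrative only: toy numbers, not data from the study.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def age_acceleration(chron_age, epi_age):
    """Residuals of the least-squares fit epi_age ~ chron_age.

    Assumed EAA definition: positive values mean a sample looks
    epigenetically older than expected for its chronological age.
    """
    n = len(chron_age)
    mx, my = sum(chron_age) / n, sum(epi_age) / n
    sxx = sum((a - mx) ** 2 for a in chron_age)
    sxy = sum((a - mx) * (b - my) for a, b in zip(chron_age, epi_age))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(chron_age, epi_age)]


# Hypothetical samples: chronological age and a clock's predicted age (years).
chron_age = [45.0, 52.0, 60.0, 67.0, 73.0]
epi_age = [48.0, 50.0, 66.0, 64.0, 80.0]

r = pearson_r(chron_age, epi_age)
eaa = age_acceleration(chron_age, epi_age)

print(f"r = {r:.2f}")
print([round(v, 2) for v in eaa])
```

By construction these residuals sum to zero and are uncorrelated with chronological age, which is why a residual-based EAA can be compared across samples of different ages. In a setup like the one described, such per-clock values, together with sex, would be the inputs to the classifier.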


Peer reviewed: True

Acknowledgements: The authors are grateful to the studies which made their data openly available. We also thank the UCL Cancer Institute Medical Genomics lab for the stimulating and inspiring discussions.


Keywords: CRC, colon tissue methylation, colorectal cancer, epigenetic age, epigenetic age acceleration, epigenetic clock

Journal Title

Front Genet


Publisher

Frontiers Media SA