Edited by: Timothy James Kinsella, Warren Alpert Medical School of Brown University, United States
Reviewed by: Wei Zhao, Stanford University, United States; Sean P. Collins, Georgetown University, United States
This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology
†These authors have contributed equally to this work
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Radiotherapy represents the most effective non-surgical modality for the potentially curative treatment of prostate cancer. Around half of survivors underwent radiotherapy as part of their curative care (
Although the prognosis is very good in terms of patients' survival rates, it is widely acknowledged that long-term side-effects after radiotherapy can affect a patient's quality-of-life (
Radiation toxicity is a multifactorial problem, related not only to the cumulative delivered dose, but also to an intrinsic process within tissues responding to cellular injury. Individual genetic background and biological expression pattern, premorbid conditions, concomitant oncological therapies, as well as the cellular microenvironment, could be important factors in the development of side-effects, although their exact contributions are unknown.
With increased interest in this field and relevant data collection on this topic, predictive models have been developed to identify patients likely to develop side effects during radiotherapy (
The identification of genetic factors associated with susceptibility to radiation toxicity represents an emerging research area in oncology. A number of different approaches have been explored (
REQUITE (validating pREdictive models and biomarkers of radiotherapy toxicity to reduce side effects and improve QUalITy of lifE in cancer survivors) was established with the aim of validating models and biomarkers for the prediction of adverse effects following radiotherapy (
The specific purpose of the present study was to attempt to validate genetic risk factors for late toxicity (rectal bleeding and late urinary symptoms) after prostate cancer radiotherapy in the REQUITE population using a deep learning algorithm. This technique aims to identify patient-specific features that define patients with toxicity (“unhealthy”) as outliers with respect to the population of irradiated patients without toxicity (“healthy”).
Deep learning has the potential to overcome the difficulties in replication of results faced by the widespread single-SNP association methods used by genome wide association studies (GWAS). The statistical power of GWAS is limited by a combination of the large number of hypotheses being tested simultaneously and the inherently small effect size of the single SNP (
Deep learning approaches, with their intrinsic hierarchical structure (where each layer performs a combination of the outcomes of the previous layers), seem particularly adept at mimicking complex dependencies within data. The method effectively addresses the following issues: (i) unstable selections of correlated variables and inconsistent selections of linearly dependent genetic variables (
REQUITE prostate cancer patients treated with external beam radiotherapy (with/without hormonal therapy, with/without a previous prostatectomy, no brachytherapy) and complete 2-year follow-up were included. Details on the REQUITE population are given in Seibold et al. (
Prostate cancer patients were recruited prior to radiotherapy between April 2014 and October 2016. Recruitment was at ten main sites in eight countries (Belgium, France, Germany, Italy, the Netherlands, Spain, UK, US). Conventionally fractionated or hypo-fractionated radiotherapy was prescribed according to local standard-of-care regimens. The patients were followed prospectively for at least 24 months, with longer follow-up encouraged where possible. All patients gave written informed consent. The study was approved by local Ethical Committees and is registered at
Demographic, co-morbidity, treatment, physics, longitudinal toxicity (CTCAE v4.0 healthcare professional and patient reported), quality-of-life, and treatment outcome data were collected prospectively using standardized case report forms. CTCAE v4.0 based questionnaires developed to collect patient reported outcomes were adapted from those published elsewhere for the male pelvis (
All patients donated at least two blood samples prior to the start of radiotherapy: an EDTA sample for SNP genotyping plus a PAXgene sample. Genotyping data were generated using the Illumina Infinium OncoArray-500K beadchip. Following standard quality control procedures (
We undertook a comprehensive search of Medline and PubMed databases using the keywords “prostate,” “prostatic,” “radiotherapy,” “radiation,” “irradiation,” “toxicity,” “adverse effects,” “side-effects,” “morbidity,” “injury,” “genetic variation,” “SNP,” “GWAS,” and “polymorphism.” This search identified 60 SNPs published (up to May 31st, 2019) in GWAS patient studies with
Forty-three of 60 SNPs were available for the REQUITE population (either directly determined or after imputation) and were included in the analysis. These SNPs were identified in five papers (
Full list of SNPs selected from the literature for validation and associated toxicity endpoint following prostate radiotherapy.
rs10519410 | 3.7 | 1.3 × 10−6 | ( |
rs17055178 | 1.95 | 6.2 × 10−10 | ( |
rs17599026 | 3.12 | 4.16 × 10−8 | ( |
rs342442 | 0.51 | 3.86 × 10−7 | ( |
rs8098701 | 2.41 | 2.11 × 10−6 | ( |
rs7366282 | 3.2 | 2.03 × 10−6 | ( |
rs10209697 | 2.66 | 2.27 × 10−6 | ( |
rs4997823 | 0.49 | 2.35 × 10−6 | ( |
rs7356945 | 1.74 | 3.71 × 10−6 | ( |
rs6003982 | 0.51 | 4.28 × 10−6 | ( |
rs10101158 | 1.8 | 4.39 × 10−6 | ( |
rs7720298 | 2.71 | 3.21 × 10−8 | ( |
rs17362923 | 2.7 | 6.79 × 10−7 | ( |
rs76273496 | 3.68 | 2.71 × 10−6 | ( |
rs144596911 | 3.6 | 2.94 × 10−6 | ( |
rs62091368 | 4.36 | 3.95 × 10−6 | ( |
rs141342719 | 3.5 | 3.97 × 10−6 | ( |
rs673783 | 2.49 | 4.33 × 10−6 | ( |
rs10969913 | 3.92 | 2.9 × 10−10 | ( |
rs11122573 | 1.92 | 1.8 × 10−8 | ( |
rs708498 | 0.24 | n.a. | ( |
rs845552 | 0.95 | n.a. | ( |
rs1799983 | 0.19 | n.a. | ( |
rs1045485 | 0.27 | n.a. | ( |
rs10497203 | 1.48 | 8.84 × 10−11 | ( |
rs7582141 | 1.45 | 4.64 × 10−11 | ( |
rs6432512 | 1.42 | 1.97 × 10−10 | ( |
rs264651 | 1.49 | 1.48 × 10−7 | ( |
rs264588 | 1.45 | 3.08 × 10−10 | ( |
rs264631 | 1.43 | 6.4 × 10−10 | ( |
rs147596965 | 1.95 | 6.19 × 10−8 | ( |
rs77530448 | 1.43 | 7.36 × 10−8 | ( |
rs4906759 | 1.73 | 1.55 × 10−7 | ( |
rs71610881 | 1.82 | 5.41 × 10−7 | ( |
rs141799618 | 1.55 | 1.22 × 10−6 | ( |
rs2842169 | 1.32 | 1.45 × 10−6 | ( |
rs11219068 | 1.32 | 1.74 × 10−6 | ( |
rs8075565 | 1.32 | 2.20 × 10−6 | ( |
rs6535028 | 1.34 | 2.70 × 10−6 | ( |
rs4775602 | 1.26 | 3.20 × 10−6 | ( |
rs7829759 | 1.39 | 3.84 × 10−6 | ( |
rs79604958 | 1.60 | 4.33 × 10−6 | ( |
rs12591436 | 1.20 | 5.66 × 10−6 | ( |
Toxicity endpoints were defined using CTCAE v4.0 scoring reported by health professionals or Patient Reported Outcomes, as detailed for each endpoint. Because the Deep Sparse AutoEncoder (DSAE) aims to identify SNPs that tag a patient as exceptionally “sensitive” to radiation (an “outlier”), patients with other known intrinsic risk factors for radiation toxicity were always excluded, in particular patients with systemic lupus erythematosus, rheumatoid arthritis and other collagen vascular diseases.
The following endpoints were considered:
Late rectal bleeding grade≥1 (CTCAE v4.0 scoring): patients exhibiting at least mild bleeding (even if requiring no intervention) at 12 or 24 months. Patients with grade≥1 at baseline and grade ≤ 1 during follow-up were considered non-bleeders; patients with hemorrhoids before radiotherapy were excluded.
Late urinary frequency grade≥2 (CTCAE v4.0 scoring): patients with urinary frequency limiting instrumental activities of daily living or requiring medical management at 12 or 24 months. Patients with urinary frequency grade≥2 at baseline and grade ≤ 2 during follow-up were considered as not exhibiting the endpoint.
Late haematuria grade ≥1 (CTCAE scoring): patients with asymptomatic haematuria (clinical or diagnostic observations only, no intervention indicated) at 12 or 24 months. Patients with haematuria grade≥1 at baseline and grade ≤ 1 during follow-up were considered as not exhibiting the endpoint.
Late nocturia grade ≥2 (Patient Reported Outcome): patients reporting a need to urinate at least two to three times per night at 12 or 24 months. Patients with nocturia grade≥2 at baseline and grade ≤ 2 during follow-up were considered as not exhibiting the endpoint.
Late decreased urinary stream grade≥1 (Patient Reported Outcome): patients scored with hesitant or dripping stream at 12 or 24 months. Patients with decreased urinary stream grade≥1 at baseline and grade ≤ 1 during follow-up were considered as not exhibiting the endpoint.
Patients who underwent transurethral resection of the bladder and patients on anti-muscarinic drugs (factors that could confound the scoring of urinary toxicity) were excluded when considering all urinary endpoints.
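The rule shared by the endpoint definitions above can be sketched in a few lines of Python. This is an illustrative reading rather than the study's own code: the function name `has_late_endpoint` and its arguments are hypothetical, grades are assumed to be integers, and a patient whose baseline grade already meets the threshold is counted as an event only when a follow-up grade exceeds that threshold.

```python
def has_late_endpoint(baseline, grade_12m, grade_24m, threshold):
    """Return True if the patient exhibits the late endpoint.

    Hypothetical helper: an event is a grade >= threshold at 12 or 24 months,
    unless the patient already had grade >= threshold at baseline and never
    exceeded the threshold during follow-up.
    """
    worst_follow_up = max(grade_12m, grade_24m)
    if baseline >= threshold and worst_follow_up <= threshold:
        return False  # pre-existing symptom that did not worsen
    return worst_follow_up >= threshold
```

For example, with a threshold of 1 (as for rectal bleeding), a patient graded 0 at baseline and 1 at 12 months counts as an event, whereas a patient graded 1 at baseline and 1 during follow-up does not.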
The methodology described in Massi et al. (
An AutoEncoder (AE) is a neural network with an output that reconstructs the input (
A more sophisticated version of AE (named Deep AE) has
Simplified scheme of a Deep AutoEncoder.
In order to get an
(i)
(ii)
(iii)
(iv)
The steps (i)-(iii) are repeated 50 times in order to reduce a possible selection bias induced by the sampling step (i), thus obtaining 50
In order to identify which features should be selected for characterizing the minority class with respect to the majority class, in step (iv) the average Reconstruction Error per feature per class is computed according to that proposed in Massi et al. (
Schematic representation of the workflow used to identify which features to select to characterize the minority class (i.e., patients with toxicity) with respect to the majority class (patients without toxicity).
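The workflow can be illustrated with a minimal NumPy sketch, under several assumptions. The reconstructor here is a linear autoencoder obtained from a truncated SVD, used only as a stand-in for the DSAE. The 80/20 split of the majority class, the squared-error measure, the number of components and the sign convention (minority minus majority) are all assumptions made for illustration; only the 50 repetitions and the per-feature, per-class averaging come from the text.

```python
import numpy as np

def fit_linear_autoencoder(X_train, n_components):
    """Stand-in for the DSAE: a linear autoencoder from a truncated SVD."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = vt[:n_components]           # shape (n_components, n_features)

    def reconstruct(X):
        codes = (X - mean) @ components.T     # encode
        return codes @ components + mean      # decode
    return reconstruct

def reconstruction_error_difference(X_majority, X_minority, n_repeats=50,
                                    train_fraction=0.8, n_components=5, seed=0):
    """Average per-feature reconstruction error per class, and their difference."""
    rng = np.random.default_rng(seed)
    n_major = X_majority.shape[0]
    n_train = int(train_fraction * n_major)
    re_major, re_minor = [], []
    for _ in range(n_repeats):
        # Assumed sampling step: random split of the majority class.
        order = rng.permutation(n_major)
        train, held_out = order[:n_train], order[n_train:]
        reconstruct = fit_linear_autoencoder(X_majority[train], n_components)
        # Squared error per feature, averaged over the patients of each class.
        err_major = (X_majority[held_out] - reconstruct(X_majority[held_out])) ** 2
        err_minor = (X_minority - reconstruct(X_minority)) ** 2
        re_major.append(err_major.mean(axis=0))
        re_minor.append(err_minor.mean(axis=0))
    re_major = np.mean(re_major, axis=0)      # average over repetitions
    re_minor = np.mean(re_minor, axis=0)
    return re_major, re_minor, re_minor - re_major
```

Training only on patients without toxicity means that features reconstructed poorly in patients with toxicity stand out as large positive differences.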
Finally, to define which SNPs are associated with late toxicity endpoints, we set thresholds at the 70th, 80th, 90th and 95th percentiles of the distribution of the Reconstruction Error differences, Δ. That is, we investigated the SNPs associated with the top 30%, top 20%, top 10% and top 5% of differences. These thresholds identify the effect size of the selected SNPs: a large effect size (Odds Ratio>2) for SNPs in the 90th/95th percentiles, and a moderate (Odds Ratio~2) or small (Odds Ratio <2) effect size for SNPs in the 80th and 70th percentiles, respectively.
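The percentile thresholding can be sketched as follows. The function name `select_snps_by_percentile`, the strict inequality and the use of NumPy's default (linearly interpolated) percentile are assumptions for illustration; the SNP labels in the usage note are placeholders, not study data.

```python
import numpy as np

def select_snps_by_percentile(snp_ids, delta, percentiles=(70, 80, 90, 95)):
    """Map each percentile threshold to the SNPs whose delta exceeds it."""
    delta = np.asarray(delta, dtype=float)
    selected = {}
    for p in percentiles:
        cutoff = np.percentile(delta, p)  # threshold on the Δ distribution
        selected[p] = [snp for snp, d in zip(snp_ids, delta) if d > cutoff]
    return selected
```

With ten placeholder SNPs whose differences are 1, 2, ..., 10, the 70th percentile cutoff keeps the three largest differences and the 95th percentile cutoff keeps only the largest. The selections are nested, so a SNP selected at a higher percentile is also selected at every lower one.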
For the interested reader, in this section we provide some more specific details regarding the development and implementation of the DSAE for the applications described in this paper. For more details on the methodology, its strengths and all of the model's hyperparameters mentioned below, refer to the description in Massi et al. (
The experiments were implemented and run in Python using the Keras framework for Deep Learning with TensorFlow as the backend.
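As a rough illustration of what a sparse autoencoder looks like in Keras, a minimal sketch is given below. It is not the architecture used in this study: the hidden layer sizes, the L1 activity penalty as the sparsity mechanism, the sigmoid output (which assumes genotypes scaled to the range 0 to 1), the Adam optimizer and the mean squared error loss are all assumptions, and `build_sparse_autoencoder` is a hypothetical name.

```python
def build_sparse_autoencoder(n_features, hidden_units=(32, 16, 32), l1_penalty=1e-5):
    """Build an illustrative sparse autoencoder (all hyperparameters assumed)."""
    # Imported lazily so that this module can be loaded without TensorFlow.
    from tensorflow import keras

    inputs = keras.Input(shape=(n_features,))
    x = inputs
    for units in hidden_units:
        # The L1 activity penalty pushes hidden activations toward zero.
        x = keras.layers.Dense(
            units,
            activation="relu",
            activity_regularizer=keras.regularizers.l1(l1_penalty),
        )(x)
    # The output layer has one unit per input feature, to reconstruct the input.
    outputs = keras.layers.Dense(n_features, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```

Such a model would be trained to reproduce its own input, for example `model.fit(X_train, X_train, epochs=...)` on patients without toxicity, with the number of epochs and batch size left to be tuned.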
For better comparability of results in the experiments we structured the DSAEs included in the
REQUITE enrolled 1,681 prostate cancer patients who were treated with external beam radiotherapy without brachytherapy. One thousand four hundred and fifty patients with complete 2-year follow-up were available for analysis. Forty-nine patients were excluded because of an intrinsic higher risk of exhibiting radiation toxicity, due to their co-morbidities (patients with a diagnosis of systemic lupus erythematosus, rheumatoid arthritis and other collagen vascular diseases). Details on the clinical characteristics of the cohorts selected for each toxicity endpoint are given in
One hundred and sixty of 1,366 available patients (11.7%) had late rectal bleeding grade≥1.
Results for late rectal bleeding grade≥1 from the Deep Sparse AutoEncoder. The 43 considered SNPs are shown on the x-axis and the averaged Reconstruction Errors (RE) on the y-axis (top panel); red columns refer to patients with toxicity and blue columns to patients without toxicity. The lower panel shows, for each SNP, the difference in averaged Reconstruction Error between the two classes (i.e., the difference between the red and blue columns). For most SNPs, the difference is close to zero (red line in the lower panel). The chosen thresholds on this difference (i.e., the highest 30, 20, 10, and 5% of differences) select the SNPs associated with the toxicity outcome. Green circles refer to SNPs previously identified as associated with late rectal bleeding, while blue circles refer to SNPs previously associated with overall toxicity as defined by the Standardized Total Average Toxicity (STAT) score (
Deep Sparse AutoEncoder testing of SNPs associated with Late Rectal Bleeding
rs10519410 | ( |
Not validated | Not validated | Not validated | Not validated |
rs17055178 | ( |
Not validated | Not validated | Not validated | Not validated |
rs264631 | ( |
Not validated | Not validated | ||
rs141799618 | ( |
Not validated | Not validated |
Fifty-six of 1,334 available patients (4.2%) experienced late urinary frequency grade≥2. Patients were excluded from the analysis if they had urinary frequency grade≥2 at baseline (
Results for late urinary frequency grade≥2 from the Deep Sparse AutoEncoder. The 43 considered SNPs are shown on the x-axis and the averaged Reconstruction Errors (RE) on the y-axis (top panel); red columns refer to patients with toxicity and blue columns to patients without toxicity. The lower panel shows, for each SNP, the difference in averaged Reconstruction Error between the two classes (i.e., the difference between the red and blue columns). For most SNPs, the difference is close to zero (red line in the lower panel). The chosen thresholds on this difference (i.e., the highest 30, 20, 10, and 5% of differences) select the SNPs associated with the toxicity outcome. Green circles refer to SNPs previously identified as associated with late urinary frequency, while blue circles refer to SNPs previously associated with overall toxicity as defined by the Standardized Total Average Toxicity (STAT) score (
Results from Deep Sparse AutoEncoder testing of SNPs associated with Urinary Frequency
rs17599026 | ( |
Not validated | |||
rs342442 | ( |
Not validated | Not validated | Not validated | Not validated |
rs8098701 | ( |
||||
rs7366282 | ( |
||||
rs10209697 | ( |
Not validated | Not validated | ||
rs4997823 | ( |
Not validated | Not validated | Not validated | Not validated |
rs7356945 | ( |
Not validated | Not validated | Not validated | Not validated |
rs6003982 | ( |
Not validated | Not validated | Not validated | Not validated |
rs10101158 | ( |
Not validated | Not validated | Not validated | Not validated |
rs147596965 | ( |
Not validated | Not validated | Not validated | |
rs77530448 | ( |
||||
rs8075565 | ( |
Not validated | Not validated | Not validated | |
rs12591436 | ( |
Not validated | Not validated | Not validated |
Seventy-four of 1,343 available patients (5.5%) experienced late haematuria grade≥1. Seventeen patients were excluded from the analysis because they had haematuria grade≥1 at baseline, while 41 were excluded because they had undergone transurethral resection of the bladder or were using anti-muscarinic drugs.
Results for late haematuria grade≥1 from the Deep Sparse AutoEncoder. The 43 considered SNPs are shown on the x-axis and the averaged Reconstruction Errors (RE) on the y-axis (top panel); red columns refer to patients with toxicity and blue columns to patients without toxicity. The lower panel shows, for each SNP, the difference in averaged Reconstruction Error between the two classes (i.e., the difference between the red and blue columns). For most SNPs, the difference is close to zero (red line in the lower panel). The chosen thresholds on this difference (i.e., the highest 30, 20, 10, and 5% of differences) select the SNPs associated with the toxicity outcome. Green circles refer to SNPs previously identified as associated with late haematuria, while blue circles refer to SNPs previously associated with overall toxicity as defined by the Standardized Total Average Toxicity (STAT) score (
Results from Deep Sparse AutoEncoder testing of SNPs associated with Late Haematuria
rs11122573 | ( |
Not validated | Not validated | Not validated | Not validated |
rs708498 | ( |
Not validated | |||
rs845552 | ( |
Not validated | Not validated | ||
rs147596965 | ( |
Not validated | |||
rs77530448 | ( |
Not validated | Not validated | ||
rs7829759 | ( |
Not validated | Not validated | ||
rs79604958 | ( |
Not validated | Not validated | ||
rs12591436 | ( |
Not validated | Not validated |
Two hundred and twenty-three of 1,250 available patients (17.8%) experienced late nocturia grade≥2. One hundred and ten patients were excluded from the analysis because they had nocturia grade≥2 at baseline, while 41 were excluded because they had undergone transurethral resection of the bladder or were using anti-muscarinic drugs.
Results for late nocturia grade≥2 from the Deep Sparse AutoEncoder. The 43 considered SNPs are shown on the x-axis and the averaged Reconstruction Errors (RE) on the y-axis (top panel); red columns refer to patients with toxicity and blue columns to patients without toxicity. The lower panel shows, for each SNP, the difference in averaged Reconstruction Error between the two classes (i.e., the difference between the red and blue columns). For most SNPs, the difference is close to zero (red line in the lower panel). The chosen thresholds on this difference (i.e., the highest 30, 20, 10, and 5% of differences) select the SNPs associated with the toxicity outcome. Green circles refer to SNPs previously identified as associated with late nocturia, while blue circles refer to SNPs previously associated with overall toxicity as defined by the Standardized Total Average Toxicity (STAT) score (
Results from Deep Sparse AutoEncoder testing of SNPs associated with Late Nocturia
rs1799983 | ( |
Not validated | Not validated | Not validated | |
rs1045485 | ( |
Not validated | Not validated | Not validated | Not validated |
rs10497203 | ( |
Not validated | Not validated | ||
rs264651 | ( |
Not validated | Not validated | ||
rs77530448 | ( |
Not validated | Not validated | ||
rs11219068 | ( |
Not validated | Not validated |
Two hundred and eleven of 1,234 available patients (17.1%) experienced late decreased stream grade≥1. One hundred and twenty-six patients were excluded from the analysis because they had decreased stream grade≥1 at baseline, while 41 were excluded because they had undergone transurethral resection of the bladder or were using anti-muscarinic drugs. Eleven SNPs were selected: two SNPs previously identified for decreased urinary stream (
Results for late decreased urinary stream grade≥1 from the Deep Sparse AutoEncoder. The 43 considered SNPs are shown on the x-axis and the averaged Reconstruction Errors (RE) on the y-axis (top panel); red columns refer to patients with toxicity and blue columns to patients without toxicity. The lower panel shows, for each SNP, the difference in averaged Reconstruction Error between the two classes (i.e., the difference between the red and blue columns). For most SNPs, the difference is close to zero (red line in the lower panel). The chosen thresholds on this difference (i.e., the highest 30, 20, 10, and 5% of differences) select the SNPs associated with the toxicity outcome. Green circles refer to SNPs previously identified as associated with late decreased urinary stream, while blue circles refer to SNPs previously associated with overall toxicity as defined by the Standardized Total Average Toxicity (STAT) score (
Results from Deep Sparse AutoEncoder testing of SNPs associated with Late Decreased Urinary Stream*.
rs7720298 | ( |
Not validated | Not validated | Not validated | Not validated |
rs17362923 | ( |
Not validated | Not validated | Not validated | Not validated |
rs76273496 | ( |
Not validated | |||
rs144596911 | ( |
Not validated | Not validated | Not validated | Not validated |
rs62091368 | ( |
Not validated | Not validated | Not validated | Not validated |
rs141342719 | ( |
Not validated | Not validated | Not validated | Not validated |
rs673783 | ( |
Not validated | Not validated | Not validated | |
rs10969913 | ( |
Not validated | Not validated | Not validated | Not validated |
rs77530448 | ( |
Not validated | Not validated | Not validated | |
rs6535028 | ( |
Not validated | Not validated | Not validated |
A simple validation approach, using univariate logistic analysis, identified eight SNPs with
In recent years, Normal Tissue Complication Probability (NTCP) models have been developed to predict, before the start of treatment, which patients are at risk of long-term radiation toxicity. These developments have been accompanied by a shift from dose-based NTCP modeling to the wider field of more “comprehensive” predictive models. Even in the hypothetical case that two patients receive exactly the “same dose distribution,” the risk of toxicity is modulated by each patient's individual profile.
The fact that “dose is not enough” was clear from the early days of radiobiology but is receiving constantly growing attention in the current “omics” epoch (Bentzen, 2006): the availability of individual information characterizing patients and potentially influencing their reactions to radiation is increasingly important, especially in the era of image-guided radiotherapy that can spare the organs at risk in most patients.
The purpose of any predictive model in oncology is to provide valid outcome predictions for new patients. Essentially, the main value of a dataset used to develop a model lies in what it teaches us about future patients. Systematic validation in multi-center collaborative settings is hence a crucial aspect in the process of predictive modeling. REQUITE is the largest multi-center observational study in this field to date, collecting standardized data longitudinally. The study was specifically designed to enable validation of models and biomarkers that predict a patient's risk of developing long-term side-effects following radiotherapy.
The present work focused on the validation of findings from previous GWAS of radiation toxicity after radiotherapy for prostate cancer. To the best of our knowledge, few validation studies in this setting have been conducted so far. Barnett et al. (
Genome-wide radiogenomic studies are identifying and validating SNPs. However, to date these studies have relied on the classical single-marker association test (in both the discovery and validation settings), which is hampered by the need for multiple-testing correction. For typical study sizes, this method can detect only relatively large effect sizes and has limited power to reliably identify modest effects from the many SNPs that are likely to contribute to a polygenic risk profile associated with radiation toxicity. Genome-wide studies therefore miss SNPs that make small but real contributions to risk.
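The cost of multiple-testing correction can be made concrete with a short calculation. The sketch below uses the Bonferroni correction as one common choice; the figure of one million independent tests is the usual convention behind the 5 × 10−8 genome-wide significance level and is an assumption here, not a number taken from this study. The 43-SNP panel is shown only for comparison, since the DSAE approach described in this paper does not rely on per-SNP tests.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level keeping the family-wise error rate at alpha."""
    return alpha / n_tests

# Conventional genome-wide setting: about one million independent tests (assumed).
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# A candidate panel of 43 SNPs, as considered in this study (comparison only).
candidate_panel = bonferroni_threshold(0.05, 43)
```

The genome-wide threshold is more than four orders of magnitude stricter than the candidate-panel threshold, which is why single-marker tests need either large effects or very large cohorts to reach significance.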
Machine learning has already been proposed as a promising alternative approach to estimate overall genetic risk (
Here, we extended the use of machine learning methods by using a method that addresses an important limitation of studies on radiation toxicity: the imbalance of classes, with a lower frequency of patients
Dealing with imbalance requires non-classical statistical solutions. Here, we explore novel methods for feature selection that come from the Deep Learning research field (
We used the DSAE to obtain the best possible representation of the majority class (without toxicity) and so to identify which features (SNPs) distinguish the minority class (with toxicity). The encoder and decoder functions are usually non-linear (e.g., sigmoid, hyperbolic tangent, rectified linear unit), which enables a better reconstruction of the input by capturing complex non-linear relationships among SNPs. Training on healthy patients allows the overall SNP pattern of normal radio-sensitivity to be established. Testing measures the “distance” between each new patient and the pattern of normal radio-sensitivity to identify SNPs associated with the highest reconstruction errors (i.e., highest distances) between the pattern of normality and the SNP profile of patients scored with toxicity (i.e., radio-sensitive patients). The distribution of the reconstruction errors allows identification and classification of SNPs with very large/large effects (SNPs above the 95th and 90th percentiles of the distribution of reconstruction errors) and with moderate/small effects (SNPs above the 80th and 70th percentiles of the distribution of reconstruction errors).
The DSAE successfully validated multiple SNPs contributing to an increased risk of toxicity. Some SNPs were already associated with the specific considered endpoint, others were previously associated with overall toxicity, and some were previously associated with other toxicities.
As is common in GWAS, many significant SNPs lie in non-coding regions, and it is premature to speculate on their functional significance. We refer readers to the original publications which discuss possible gene functions (
The main strength of our study is use of a large international prospective multi-center cohort of patients treated with modern radiotherapy techniques and fractionation schemes. The patients were specifically enrolled to validate models and biomarkers for predicting radiation toxicity, and the study design involved a standardized data collection scheme for collecting healthcare professional and patient-reported outcomes. The extensive role of data management also allowed for quality assurance of data collected, and we used “real world” data coming from “data-farming” (
A possible limitation of our study was the use of 2-year follow-up toxicity data. The REQUITE study is still maturing: normal tissue reactions in the intestinal and urinary tract develop gradually from 6 months after radiotherapy until around 3 years for the intestinal syndrome and 5 years for the urinary syndrome. Recent additional funding is allowing extension of the REQUITE study, with the aim of achieving standardized collection of follow-up data up to year 5.
The use of grade 1 and grade 2 events is another possible limitation of this study. As the application of deep learning techniques requires a suitable number of events, the choice of mild or moderate (when possible) toxicity was dictated by the number of morbidity events registered in the REQUITE population. The low number of severe toxicity events certainly reflects modern radiotherapy techniques, which allow substantial sparing of normal tissues, at least in prostate cancer irradiation. Yet some grade 1 and grade 2 toxicities can become chronic, with substantial impact on the quality of life of long-term survivors. For example, grade 2 urinary frequency and nocturia can impair daily activities and quality of sleep for many years (
We have shown our approach is worth studying further and the next step would be to use it to identify patterns of SNPs to define polygenic risk scores that can be included into integrated normal tissue complication probability models, together with validated dosimetric and clinical risk factors.
The DSAE methodology underlines that, with current radiotherapy, experiencing no toxicity can be considered the “normal” situation, with patients with mild/moderate toxicity being outliers. Knowledge of an individual patient's intrinsic radiosensitivity and the identification of these outlier subjects could help in tailoring decision making. This should not entail compromising the probability of tumor control to avoid mild/moderate side-effects; rather, it should focus on maximizing uncomplicated tumor control, also taking into account the patient's inclination toward the different side-effects. The availability of such models would be clinically relevant, allowing optimization for the individual patient and constituting an important step toward the implementation of predictive modeling in the clinic. This approach would allow tailoring of the therapeutic approach (e.g., active surveillance vs. prostatectomy vs. brachytherapy vs. external beam radiotherapy) and of doses (both to tumor and organs at risk) to the specific patient anatomy, clinical situation and individual biology. Combining biological stratification with toxicity-reducing techniques (such as image fusion, image guidance, fractionation and reduced margins for the Planning Target Volume) could further decrease treatment-related toxicity rates and allow for dose escalation to enhance tumor control. Integrated predictive models will also be an essential tool in the design of interventional trials to modify radiotherapy strategies. A detailed discussion of the potential ways in which biomarker/SNP assays might be implemented in routine clinical practice can be found in Azria et al. (
Other future work could study the possibility of “scaling” the use of DSAEs to the discovery of new genetic signatures using the whole GWAS information available in the REQUITE population, thus making it possible to consider millions of features to detect outliers.
A deep learning approach can validate SNPs associated with toxicity after radiotherapy. The method can identify complex SNP signatures for multiple toxicity endpoints and should be studied further to extract polygenic risk scores to include in integrated normal tissue complication probability models that could be used to personalize radiotherapy planning.
Funding for the five-year REQUITE project ended on 30th September 2018. REQUITE does not benefit financially from supplying data and/or samples to researchers, but does make a charge to cover its costs and support continued maintenance of the database and biobank beyond the end of the funding period. To facilitate this continued access for researchers, the REQUITE Steering Committee approved a tiered cost recovery model for access to data and/or samples. Contact REQUITE (
The REQUITE study was reviewed and approved by North West - Great Manchester East Ethics Committee (UK, reference 14 NW 0035) and by the local Ethics Committees of all participating centers. The patients provided their written informed consent to participate in this study and for the publication of the data included in this article.
MM, AP, FG, TRan, and CW: study design. MM, FG, and TRan: study development. AP, FI, AM, PZ, RE, and JC-C: coordination/supervision of the study. LV, PO, VF, TRat, PRS, KJ, ML, KH, GdM, DdR, BV, EvL, ACh, ES, CH, MV, BA, RV, DA, M-PJ, RS, KC, and PC: patient enrolment and follow-up. CT, TG, ACi, BR, AV, and MA-B: collection of the data. LF, AD, SK, and DP: SNP assay. JC-C, PS, AW, and RE: trial and data management. MM, FG, NF, FI, AP, AM, and TRan: statistical analysis. MM, FG, NF, TRan, and CW: draft of the paper. All authors: critical revision of the manuscript/final approval.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: