Increased use of cross-sectional imaging for follow-up does not improve post-recurrence survival of surgically treated initially localized R.C.C.: results from a European multicenter database (R.E.C.U.R.).

Abstract Objective: Modality and frequency of image-based renal cell carcinoma (R.C.C.) follow-up strategies are based on risk of recurrence. Using the R.E.C.U.R.-database, frequency of imaging was studied in regard to prognostic risk groups. Furthermore, it was investigated whether the imaging modalities used in contemporary follow-up were associated with outcome after detection of recurrence. Moreover, outcome was compared based on whether the assessment of potential curability followed a pre-defined set of criteria (per protocol) or was stated by the investigator. Materials and methods: Consecutive non-metastatic R.C.C. patients (n = 1,612) treated with curative intent at 12 institutes across eight European countries between 2006 and 2011 were included. Leibovich or U.I.S.S. risk group, recurrence characteristics, imaging modality, frequency and survival were recorded. Primary endpoints were overall survival (O.S.) after detection of recurrence and frequency of features associated with favourable outcome (non-symptomatic recurrences and detection within the follow-up-programme). Results: Recurrence occurred in 336 patients. Within low, intermediate and high risk for recurrence groups, the frequency of follow-up imaging was highest in the early phase of follow-up and decreased significantly over time (p < 0.001). However, neither the imaging modality for detection nor ≥ 50% cross-sectional imaging during follow-up was associated with improved O.S. after recurrence. Differences between per protocol and investigator-based assessment of curability did not translate into differences in O.S. Conclusions: As expected, the frequency of imaging was highest during early follow-up. Cross-sectional imaging use for detection of recurrences following surgery for localized R.C.C. did not improve O.S. post-recurrence. Prospective studies are needed to determine the value of imaging in follow-up.


Introduction
Among the purposes for follow-up after radical treatment of renal cell carcinoma (R.C.C.) are observation of renal function, recovery from surgery, oncological control to detect recurrence of disease manifestations and, finally, a psychosocial need for both patient and physician following cancer treatment [1].
It seems deeply rooted that early detection of cancer recurrences results in more effective treatment, which improves survival. Based on this assumption, most of the readily used R.C.C. follow-up strategies adapt their imaging modality and frequency to the risk and potential site of recurrence [2][3][4]. During the last decades, this risk-based approach to follow-up has resulted in increased recommendations for follow-up imaging and subsequently an increased use of cross-sectional imaging in R.C.C. follow-up [2,5-12]. The literature investigating the impact of follow-up imaging after R.C.C. treatment is limited [13][14][15], but a recent study failed to show superiority in regard to post-recurrence survival for more intensive use of follow-up imaging [12]. However, to our knowledge, there are no comparative studies exploring if a specific imaging modality actually translates into improved overall survival after R.C.C. recurrence.
The European Association of Urology (E.A.U.) R.C.C. Guidelines Panel has established a collaborative multi-centre consortium (R.E.C.U.R.) to investigate comparators for evidence-based follow-up recommendation for localized R.C.C. In contrast to previously published follow-up studies, the focus of R.E.C.U.R. is on further management and outcome once a recurrence is detected. To achieve uniform definitions for comparisons between groups, the R.E.C.U.R. database utilizes per protocol-based data collection. However, arbitrary global per protocol assessments of potential curability of R.C.C. recurrence may be disputed, and as such an investigator-based assessment of curability is also registered in the R.E.C.U.R. database.
The primary aim of the present study was to describe contemporary frequencies of follow-up imaging stratified by risk of recurrence groups. The second aim was to look for potential differences in outcome after recurrence, based on the imaging modalities used for follow-up and recurrence detection. The final aim was to explore whether the outcome results differed significantly depending on whether global per protocol or investigator-based assessments of curability of the recurrences were used.

Materials and methods
The R.E.C.U.R.-database, quality assurance, exclusions and ethical considerations
For the current study, R.E.C.U.R. collected data from 1889 patients with localized R.C.C. from 12 centres (all with appropriate institutional approval) in eight European countries (see Supplementary Appendix). Eligible patients underwent surgery with curative intent from January 2006 (the start of the tyrosine kinase inhibitor era) to December 2011, allowing for a minimum of 4 years of follow-up for patients still alive and without recurrence at inclusion in the study. All data were audited for quality and completeness by a urological surgeon (S.D.). After exclusions (Figure 1), the final study population consisted of 1612 patients for the current analysis. The median follow-up for patients who neither experienced recurrence nor died was 63 months (I.Q.R. = 58-76). Patient characteristics are shown in Table 1.

Definitions used for analyses
The validated risk grouping system described by Leibovich [16,17] was used for clear cell R.C.C., while the University of California Los Angeles Integrated Staging System (U.I.S.S.) system [18] was used for non-clear cell R.C.C. Overall survival after recurrence was defined as the time from recurrence until death of any cause or, for patients still alive, to the date of last follow-up.
Imaging frequency was defined as the total number of imaging studies during follow-up until recurrence or last follow-up, divided by years of follow-up. As most of the institutional follow-up imaging strategies utilized were both risk- and time-dependent, with more imaging in the early years after treatment, we devised three follow-up groups (follow-up until recurrence or last follow-up or death of other causes): short-term follow-up (0-2.49 years), mid-term follow-up (2.5-5.49 years) and long-term follow-up (> 5.5 years) after treatment of the primary tumour. These were applied to all three risk groups, resulting in nine patient groups.
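The two definitions above can be illustrated with a minimal sketch. The function names and the example patient are hypothetical and not taken from the R.E.C.U.R. database; the handling of a follow-up of exactly 5.5 years is an assumption, since the text specifies 2.5-5.49 years and > 5.5 years.

```python
def imaging_frequency(n_imaging_studies: int, followup_years: float) -> float:
    """Imaging studies per year of follow-up (until recurrence or last follow-up)."""
    if followup_years <= 0:
        raise ValueError("follow-up time must be positive")
    return n_imaging_studies / followup_years


def followup_group(followup_years: float) -> str:
    """Assign the short-, mid- or long-term follow-up group."""
    if followup_years < 2.5:
        return "short-term"
    if followup_years < 5.5:
        return "mid-term"
    # Assumption: a follow-up of exactly 5.5 years is counted as long-term.
    return "long-term"


# Hypothetical patient: 9 imaging studies over 4.5 years of follow-up.
freq = imaging_frequency(9, 4.5)   # 2.0 studies per year
group = followup_group(4.5)        # "mid-term"
```

Combining the resulting follow-up group with the patient's risk group (low, intermediate or high) gives the nine patient groups used in the analysis.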
Methods of imaging were cross-sectional imaging (C.S.I.; computerized tomography (C.T.) or magnetic resonance imaging (M.R.I.)) or conventional (chest x-ray (C.X.R.) or ultrasound (U.S.)). Ratios for abdominal and thoracic imaging were calculated by dividing cross-sectional by conventional imaging.
All patients were further divided into two groups depending on their C.S.I. percentage of the total number of imaging tests (≥ 50% vs < 50%). The cut point for dichotomization was chosen for simplicity as it was close to the median.
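As a sketch of this dichotomization, the C.S.I. percentage can be computed from per-patient imaging counts as follows. The function names and the example counts are hypothetical; only the 50% cut point and the grouping of C.T. and M.R.I. as cross-sectional imaging come from the text.

```python
def csi_share(n_ct: int, n_mri: int, n_cxr: int, n_us: int) -> float:
    """Fraction of all follow-up imaging tests that were cross-sectional (C.T. or M.R.I.)."""
    total = n_ct + n_mri + n_cxr + n_us
    if total == 0:
        raise ValueError("patient has no imaging tests")
    return (n_ct + n_mri) / total


def csi_group(share: float, cut_point: float = 0.5) -> str:
    """Dichotomize at the cut point: 'high' if >= 50% C.S.I., otherwise 'low'."""
    return "high" if share >= cut_point else "low"


# Hypothetical patient: 3 C.T., 1 M.R.I., 2 chest x-rays, 2 ultrasound scans.
share = csi_share(3, 1, 2, 2)   # 0.5
label = csi_group(share)        # "high", because the cut point itself is included
```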
The primary endpoints were detection of recurrence while non-symptomatic and detection within institutional follow-up, as these may serve as surrogate indicators of improved outcome after recurrence [19]. Secondary analyses were: (i) the relationship between the primary endpoints and methods of imaging during follow-up; and (ii) the correlation between methods of imaging and overall survival after recurrence.
The global per protocol definition of a potentially curable (P.C.) R.C.C. recurrence was, as previously published [12,19], taken to be local recurrence, single metastasis or oligometastasis (≤ 3 lesions at a single site). All other recurrences were considered probably incurable (P.I.). Additionally, an investigator-based assessment of each patient with recurrence (investigator-based P.C. or P.I.) was established by an investigator from each contributing R.E.C.U.R. institute.

Statistical analysis
Descriptive statistics were presented as categorical variables with percentages and continuous variables as median and interquartile range (I.Q.R.). For categorical and non-parametric data, exact Chi-square test and Mann-Whitney U-test or Kruskal-Wallis test, respectively, were used. Correlation for group allocation for P.C./P.I. was evaluated with Kappa statistics. Kaplan-Meier method with Log-Rank test was performed for overall survival. For all statistical comparisons, a two-   Table 2).

Imaging modalities and frequencies
Irrespective of risk group, the highest frequency of imaging was during early follow-up, and decreased significantly with longer follow-up (overall p < 0.001). The median frequency of imaging increased with increasing risk group allocation in all follow-up groups. The frequency of imaging was not significantly different between patients who developed recurrences and those who did not, except for the mid-term follow-up group of high risk group patients, where those with recurrences underwent more imaging (p = 0.002; Table 3).

Recurrences and outcome
Recurrences were detected by C.S.I. in 257 of 336 patients (76%), and 210 patients (63%) had ≥ 50% of their follow-up imaging performed by C.S.I. In the low and intermediate risk groups, more recurrences were detected as part of regular follow-up when ≥ 50% C.S.I. was performed during follow-up. The difference, however, was only statistically significant for the intermediate risk group (Table 4). For detection of non-symptomatic recurrences, no significant difference was seen between the high and low C.S.I. groups (Table 4).
There was a non-significant tendency towards more recurrences being detected via routine follow-up and being nonsymptomatic at detection if the frequency of imaging was above the median rather than below the median (see Supplementary Table S1).
There was no significant difference in overall survival between P.C. and P.I. patients stratified for the type of imaging resulting in detection of their recurrence (Figure 2a).
Similarly, there was no significant difference in overall survival after recurrence based on high (≥ 50%) or low (< 50%) C.S.I. percentage during follow-up (Figure 2b). Moreover, exploratory analyses with quartiles for C.S.I. frequencies gave similar results.

Global per protocol assessment vs investigator-based assessment of curability
Of 336 recurrences, by the global per protocol definition of recurrence curability, 152 (45%) were classified as P.C., while the remaining 184 (55%), with multiple metastases, were considered P.I. When applying the investigator-based assessment of recurrence curability, the numbers were 123 (37%) and 213 (63%) for P.C. and P.I., respectively. Investigator-based assessment classified 40 P.C. patients as P.I. and 11 P.I. patients as P.C. The kappa value for the scoring was 0.69.
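The reported agreement can be checked from the counts given above. The following is a minimal sketch assuming that the kappa statistic is the unweighted Cohen's kappa for two raters and two categories (the text states only "Kappa statistics"); the function name is hypothetical, and the four cell counts are derived from the reported totals and reclassifications.

```python
def cohen_kappa_2x2(a: int, b: int, c: int, d: int) -> float:
    """Unweighted Cohen's kappa for a 2x2 agreement table.

    Rows are the per protocol assessment, columns the investigator-based one:
        a = both P.C.            b = per protocol P.C., investigator P.I.
        c = per protocol P.I.,   d = both P.I.
            investigator P.C.
    """
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (observed - expected) / (1 - expected)


# Cell counts derived from the text: 152 per protocol P.C. (40 reclassified as P.I.)
# and 184 per protocol P.I. (11 reclassified as P.C.).
both_pc = 152 - 40   # 112
both_pi = 184 - 11   # 173
kappa = cohen_kappa_2x2(both_pc, 40, 11, both_pi)
print(round(kappa, 2))  # 0.69
```

Under this assumption the observed agreement is 285/336 (about 85%) and the result matches the reported kappa of 0.69; the derived column totals (112 + 11 = 123 P.C. and 40 + 173 = 213 P.I.) also agree with the investigator-based numbers in the text.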
In 20 of 70 solitary, 16 of 38 oligometastatic and four of 25 local recurrences, investigator-based assessment classified them as P.I. rather than P.C. These patients were older (68 years vs 65 years, p = 0.102), and in ~ 50% of cases there was an investigator's note in the R.E.C.U.R. database stating comorbidity and/or patient's wishes prohibiting curative intended procedures (surgery/ablation/radiation (i.e. stereotactic radiotherapy)). Kaplan-Meier estimates showed that the median overall survival for P.C. patients was 50 months vs 43 months for the investigator-based assessment and global per protocol groups, respectively (p = 0.2) (Figure 3). For P.I. patients the median overall survival was 16 months for both the investigator-based assessment and global per protocol assessment.

Discussion
It is generally believed that regular imaging has the potential to reveal recurrences early while small and asymptomatic. However, for such imaging strategies to be useful the disease has to behave in a predictable pattern in the majority of patients, with recurrences growing linearly and disseminating to pre-determined sites in a predictable fashion. We have previously shown that only 2% of patients with initially localized R.C.C. in the high risk group will, after recurrence detection, remain disease free after resection of recurrence [19]. In this study we showed that the frequency of C.S.I. and mode of imaging at detection for patients with recurrent disease had no bearing on the oncological outcome.
Imaging in most cancer follow-up protocols follows defined intervals, with the highest frequency in periods for which historic data have shown that recurrences are most likely to be diagnosed. Our results demonstrate that the participating institutions, during the study period, used follow-up imaging relatively similar to the present recommendations from E.A.U. [3], both in regard to use of imaging based on risk stratification and the duration of follow-up. As the frequency of imaging for R.C.C. patients developing recurrence and those remaining disease free is relatively similar, our figures most likely represent the daily practice at the institutions.
In the 2017 edition, the E.A.U. R.C.C. guidelines removed C.X.R. from the follow-up recommendation. The present study in patients treated between 2006 and 2011 shows that C.X.R. was the most used modality for investigation of the thorax. Similarly, the use of ultrasound was more frequent than recommended by the E.A.U. guidelines. With the updated recommendations in mind, it is intriguing that imaging modality utilized does not seem to translate into a survival benefit. If no gain can be identified by the use of C.T.T. instead of C.X.R., questions about cost-effectiveness and increased radiation exposure may be justified.
It is well documented that micrometastatic R.C.C. cells may remain dormant for a long time before they develop into macroscopic, detectable disease. The reasons for dormancy may be multiple [20], including genomic classification [21], the inability to recruit blood vessels, immune surveillance, cell cycle arrest or tumor microenvironment interactions. There may be several causes for these disease foci to start growing at some time point. Some of these are tumor regulated, such as the onset of chromosomal instability [22], but they may also be triggered by external factors such as other diseases and surgical [23] or other traumas (e.g. fractures or other traumatic injuries). It is hypothesized that increased levels of growth factors may stimulate several dormant tumors at the same time, resulting in disseminated visible metastatic disease in a short period of time [23,24]. Moreover, unlike some other cancers with predictable patterns of recurrence, e.g. prostate cancer, R.C.C. has the potential to metastasize to most organs. The proportion of R.C.C. recurrences at sites not covered by C.T. of the thorax and abdomen is not negligible, and up to 16% has been reported [25]. Hence, such an image-based follow-up programme has an a priori inherent failure rate. Furthermore, these recurrences will in most cases be detected as symptomatic, with known poorer prognosis [25,26].
Within the R.E.C.U.R. collaboration, a global per protocol assessment of potential curability has been established [12,19]. The definite advantage of this methodology is that it only accounts for disease-related factors such as type and number of metastatic sites, and is, thus, uniform and reproducible. In contrast, an investigator-based assessment is subjective and appears to be affected by both disease- and patient-related factors like age, comorbidity and patients' choices. In our opinion, and especially in a retrospective setting, the need for limitation of potential confounders is important and might be better solved by a per protocol approach. However, for a study to be considered valid and useful, the results need to be recognized by clinicians. Therefore, reconciling a per protocol assessment with an investigator-based assessment is important. In this study, we found differences in the assessment, but these did not translate into significant differences in overall survival post-recurrence for the P.C. and P.I. groups. In our opinion, these results reinforce the decision to use a per protocol assessment of curability within R.E.C.U.R.
As our study is retrospective, and thus has obvious limitations, interpretations must be made with caution. All R.E.C.U.R. institutes used their own follow-up protocols with varying intervals between each imaging performed. Therefore, it was not possible to demonstrate to what extent each patient underwent imaging at the recommended time point. We acknowledge that C.T. detects lesions with higher resolution than ultrasound/C.X.R. [27]. However, there is little evidence that C.T. has impacted the results significantly. The fact that all histological R.C.C. sub-types were included in the current analysis may be a further limitation. Indeed, there are published histological sub-type-specific follow-up strategies but, nevertheless, the major guidelines (E.A.U., A.U.A. and N.C.C.N.) currently continue to provide follow-up strategies irrespective of R.C.C. sub-type [2][3][4].
The present study did not evaluate quality-of-life aspects. Follow-up definitely serves a psychosocial need which may be as important as the aspect of oncological control. Anxiety after surgery for cancer leaves patients with a need for timely reassurance that they remain free of disease. Therefore, some kind of routine follow-up is probably indicated. However, the present study questions the need to increase the use of static and regular follow-up imaging.
It is likely that improved risk stratification tools will become available. Several molecular panels have shown prognostic utility [28][29][30]. Competing risk analyses have been introduced for tailoring follow-up programmes to the individual patient, factoring in age and comorbidities [19,31], suggesting that routine follow-up be reduced in patients where the risk of death of other causes supersedes the risk of dying from R.C.C. The future of R.C.C. follow-up is most likely to be much more personalized and routine follow-up will be replaced by tailored imaging during periods when recurrences are most likely to occur. Moreover, in the future, new follow-up programmes will have to be cost-effective [32].

Conclusion
The present study suggests that the mode of imaging for follow-up, detection of recurrences and the frequency at which imaging is applied do not affect subsequent overall survival. Prospective studies are needed to confirm these findings and help design optimal follow-up strategies which may be less intense but more personalized than those currently used.