Journal of Clinical Oncology, Vol 25, No 11 (April 10), 2007: pp. 1316-1322
© 2007 American Society of Clinical Oncology. DOI: 10.1200/JCO.2006.06.1218

Multi-Institutional Validation of a New Renal Cancer–Specific Survival Nomogram
From the Cancer Prognostics and Health Outcomes Unit, University of Montreal Health Center, Montreal, Quebec, Canada; Department of Urology, University of Verona, Verona; Urology Unit, G. Rummo Hospital, Benevento; Department of Urology, University of Padua, Padua, Italy; Department of Urology, Henri Mondor University Hospital, Creteil; Department of Urology, North Hospital, University Hospital of Saint-Etienne, Saint-Etienne; Department of Urology, Angers University Hospital, Angers; Department of Urology, Rennes 1 University Hospital, Rennes; Comité de Cancérologie de l'Association Française d'Urologie, France; Department of Urology, University Medical Center, Nijmegen, the Netherlands; and the Department of Urology, University Hospital, Medical University of Graz, Graz, Austria Address reprint requests to Pierre I. Karakiewicz, MD, Cancer Prognostics and Health Outcome Unit, University of Montreal Health Center, 1058, rue St-Denis, Montreal, Quebec, Canada H2X 3J4; e-mail: pierre.karakiewicz{at}umontreal.ca
Purpose
We tested the hypothesis that the prediction of renal cancer–specific survival can be improved if traditional predictor variables are used within a prognostic nomogram.

Patients and Methods
Two cohorts of patients treated with either radical or partial nephrectomy for renal cortical tumors were used: one (n = 2,530) for nomogram development and for internal validation (200 bootstrap resamples), and a second (n = 1,377) for external validation. Cox proportional hazards regression analyses modeled the 2002 TNM stages, tumor size, Fuhrman grade, histologic subtype, local symptoms, age, and sex. The accuracy of the nomogram was compared with an established staging scheme.

Results
Cancer-specific mortality was observed in 598 (23.6%) patients, whereas 200 (7.9%) died as a result of other causes. Follow-up ranged from 0.1 to 286 months (median, 38.8 months). External validation of the nomogram at 1, 2, 5, and 10 years after nephrectomy revealed predictive accuracy of 87.8%, 89.2%, 86.7%, and 88.8%, respectively. Conversely, the alternative staging scheme predicting at 2 and 5 years was less accurate, as evidenced by 86.1% (P = .006) and 83.9% (P = .02) estimates.

Conclusion
The new nomogram is more contemporary, provides predictions that reach further in time and, compared with its alternative, which predicts at 2 and 5 years, generates 3.1% and 2.8% more accurate predictions, respectively.
Accurate prediction of cancer-specific survival in patients with renal cortical tumors (RCs) is important for counseling, planning of follow-up, and selection for appropriate adjuvant trial designs. The TNM-derived American Joint Committee on Cancer (AJCC) classification represents the gold standard staging scheme after nephrectomy for RC.1-3 Moreover, integrated staging systems demonstrate that the contribution of symptoms at presentation, tumor histology, and tumor size are important in prediction of prognosis.4-6 We hypothesized that the application of routinely available RC-specific survival predictors in a nomogram setting could yield predictive accuracy that exceeds currently available models.
Patient Population
Five participating institutions contributed 2,576 patients treated with either radical or partial nephrectomy for RC. This cohort constituted the nomogram development cohort, whereas 1,430 additional patients from six institutions were included in the external validation cohort (Table 1). Clinical and pathologic data for these multi-institutional cohorts were gathered prospectively at each center and included patient age at nephrectomy and sex. The symptom classification was defined as described previously.7 The Eastern Cooperative Oncology Group (ECOG) performance status was only available in the development cohort. The pathologic tumor specimen was used to define the 2002 T stage, tumor size, and Fuhrman grade. Histologic subtypes were defined according to the 2002 Union International Contre le Cancer classifications and only tumors of clear cell, chromophobe, and papillary histology were included.8 Presence of nodal and metastatic disease was defined according to intraoperative, pathologic, and radiographic findings. In all patients, the presence of nodal or distant metastases was confirmed by biopsy or pathologic analysis. Patients were staged preoperatively with computed tomography of the abdomen and pelvis, chest computed tomography or chest x-ray, serum electrolytes, and liver function tests. All data collection and analyses were undertaken with the approval and institutional oversight of the Institutional Review Board for the Protection of Human Subjects.
Within the nomogram development cohort of 2,576 patients, 46 were excluded because of missing data on tumor size (n = 8), Fuhrman grade (n = 5), histologic subtype (n = 30), cause of death (n = 2), or symptom classification (n = 1). Within the external validation cohort of 1,430 patients, 53 were excluded because of missing sex (n = 1) or histologic subtype (n = 52). The analyses of cancer-specific survival after nephrectomy were performed on 2,530 assessable patients with complete records, whereas external validation was accomplished on 1,377 patients. In both cohorts, assessment of mortality and determination of the cause of death were performed by the treating physician, who relied on chart review and/or death certificate. In cancer-specific survival analyses, perioperative deaths (within 30 days of surgery) were censored.
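The cohort accounting above can be checked with a few lines of arithmetic. The sketch below assumes that each excluded patient is counted under exactly one missing variable, which the text implies but does not state; the function and variable names are illustrative only.

```python
# Exclusion counts as reported in the text (one reason per excluded patient is assumed).
development_exclusions = {
    "tumor size": 8,
    "Fuhrman grade": 5,
    "histologic subtype": 30,
    "cause of death": 2,
    "symptom classification": 1,
}
validation_exclusions = {"sex": 1, "histologic subtype": 52}


def assessable(enrolled, exclusions):
    """Return (number excluded, number remaining) for one cohort."""
    excluded = sum(exclusions.values())
    return excluded, enrolled - excluded


dev_excluded, dev_assessable = assessable(2576, development_exclusions)
val_excluded, val_assessable = assessable(1430, validation_exclusions)
print("development:", dev_excluded, "excluded,", dev_assessable, "assessable")
print("validation:", val_excluded, "excluded,", val_assessable, "assessable")
```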
Statistical Analyses
Given that a proportion of patients with RC die as a result of other causes, competing-risks regression was used to test the significance of the described variables in predicting RC-specific mortality after accounting for other-cause mortality. A strong effect of competing mortality may result in extensive censoring due to cancer-unrelated deaths. Such censoring may artificially reduce the pool of individuals at risk of RC-specific events, which may in turn lead to overestimation of the effect of RC-specific mortality. The effect of other-cause mortality cannot be accounted for in Cox regression models. However, competing-risks regression, as described by Fine and Gray,13 can remove this limitation by distinguishing between RC-specific and other-cause mortality. Unfortunately, there are no commercially available statistical packages that would allow applying competing-risks regression within a nomogram setting. Actuarial survival probabilities were estimated using the Kaplan-Meier method. Multivariate Cox regression coefficients were then used to generate the nomogram. The predictive accuracy of nomograms is usually quantified with receiver operating characteristics–derived area under the curve estimates.9,10 In Cox regression models, the area under the curve is substituted with Harrell's concordance index, which was used in this analysis.11,12 A value of 100% indicates perfect predictions, whereas 50% is equivalent to a toss of a coin. Internal validation relied on 200 bootstrap resamples.14 Predictive accuracy estimates were compared using the Mantel-Haenszel test. Calibration plots were generated to explore the performance characteristics of the nomogram at 1, 2, 5, and 10 years after nephrectomy.
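To make the accuracy metric concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' S-Plus or R code. The toy follow-up times, event indicators, and risk scores are invented for illustration, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: follow-up time per patient
    events: 1 = cancer-specific death observed, 0 = censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is anchored on the patient with the observed death
        for j in range(n):
            # Usable pair: patient i died strictly before patient j's last follow-up.
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data (months of follow-up), not taken from the study.
times = [5, 10, 12, 20, 30, 40]
events = [1, 1, 0, 1, 0, 0]
risk_scores = [0.9, 0.7, 0.3, 0.15, 0.2, 0.1]

c_index = harrell_c(times, events, risk_scores)
print(f"concordance index: {c_index:.3f}")
```

A value of 1.0 corresponds to the 100% accuracy described above, and 0.5 corresponds to the toss of a coin.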
Finally, we used the external validation cohort to compare the final, reduced, nomogram-predicted RC-specific mortality versus the observed RC-specific mortality at 1, 2, 5, and 10 years. Nomogram-derived mortality predictions were computed with the nomogram formula, which was derived from the development cohort. Subsequently, the nomogram-predicted probabilities of RC-specific mortality were compared with the observed rates of RC-specific mortality at 1, 2, 5, and 10 years after nephrectomy, and the accuracy of time-specific predictions was quantified using the predicted probability validation method (val.prob) from the S-Plus design library (Statistical Sciences, Seattle, WA). Moreover, the relationship between predicted and observed rates was explored graphically using the val.surv function from the R statistical package. The University of California, Los Angeles Integrated Staging System (UISS), which predicts RC-specific survival at 2 and 5 years, was used as a comparison benchmark for the newly developed nomogram.4 Given that the UISS requires the inclusion of the ECOG performance status, its accuracy was tested within the cohort of 2,530 patients with available ECOG performance status. This validation of the UISS represents an external validation of this prognostic scheme. The Mantel-Haenszel test was used to compare the difference between the accuracy of our nomogram and that of the UISS prognostic scheme. All statistical tests were performed with S-Plus Professional and R statistical packages, and statistical significance was set at .05.
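The comparison of predicted and observed rates described above can be sketched as follows. This is a simplified stand-in, not a reimplementation of val.prob or val.surv: patients are split into equal-sized groups by predicted survival, and the mean prediction in each group is compared with the Kaplan-Meier estimate at the chosen horizon. All data and function names are hypothetical.

```python
def kaplan_meier_at(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon` (events: 1 = death, 0 = censored)."""
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
    return survival


def calibration_table(times, events, predicted_survival, horizon, n_groups):
    """Return (mean predicted survival, Kaplan-Meier observed survival) per risk group."""
    order = sorted(range(len(times)), key=lambda i: predicted_survival[i])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        idx = order[start:end]
        mean_pred = sum(predicted_survival[i] for i in idx) / len(idx)
        observed = kaplan_meier_at(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append((mean_pred, observed))
    return rows


# Hypothetical toy data: follow-up in months and predicted 5-year survival.
times = [10, 30, 20, 65, 70, 90, 80, 100]
events = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [0.4, 0.7, 0.5, 0.8, 0.5, 0.8, 0.6, 0.9]

rows = calibration_table(times, events, predicted, horizon=60, n_groups=2)
for mean_pred, observed in rows:
    print(f"predicted {mean_pred:.2f} vs observed {observed:.2f}")
```

Points lying on the 45-degree line of a calibration plot correspond to rows in which the two numbers agree.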
The descriptive statistics of the 2,530 assessable patients are listed in Table 1. T3 stages predominated and accounted for 900 patients (35.6%). Tumor size ranged from 0.5 to 25 cm (mean, 6 cm). Fuhrman grade was I in 665 (26.3%), II in 835 (33%), III in 832 (32.9%), and IV in 198 (7.8%) patients. Conventional clear cell histology was reported in 2,245 (88.7%) patients. Local symptoms were present in 900 (35.5%) patients and systemic symptoms were present in 482 (19.1%) patients. Of 2,530 patients, 798 (31.5%) died, and 598 of 798 deaths (74.9%) were attributable to RC. As shown in Appendix Figure A1 (online only), the median time to RC-specific mortality was not reached in either the internal (mean, 15.8 years) or the external (mean, 16.2 years) cohorts. Within the internal validation cohort of 2,530 patients, 2,043 (9,570 person-years; 80.9%), 1,648 (7,765 person-years; 66%), 937 (4,066 person-years; 34.4%), and 344 (1,242 person-years; 10.5%) patients remained at risk of dying at 1, 2, 5, and 10 years, respectively. Actuarial cancer-specific survival probabilities were 89.7% (95% CI, 88.4% to 90.9%), 83.2% (95% CI, 81.5% to 84.7%), 74.2% (95% CI, 72.2% to 76.2%), and 67.2% (95% CI, 64.6% to 69.7%) at 1, 2, 5, and 10 years after surgery, respectively (Appendix Fig A1, online only). For the internal validation cohort, Kaplan and Meier plots of cancer-specific survival according to all clinically and/or pathologically available predictor variables are shown in Appendix Figure A2 (online only).
Table 2 lists the univariate and multivariate Cox RC-specific survival models that were developed on the cohort of 2,530 patients. In univariate analyses, the TNM stages, age, symptoms, tumor size, Fuhrman grade, and histologic subtypes represented highly statistically significant predictors of cancer-specific survival (all P
Table 3 lists univariate and multivariate competing-risks regression models that were developed on the cohort of 2,530 patients. In multivariate competing-risks regression models, all variables (all P < .04), except for Fuhrman grade II versus I (P = .6) and for histologic subtypes (P = .1 and P = .06), were statistically significant predictors of RC-specific mortality, after accounting for other-cause mortality (Table 3).
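The reason other-cause deaths matter can be illustrated without any regression. The sketch below is not the Fine and Gray model used for Table 3; it swaps in the simpler nonparametric cumulative incidence estimator and compares it with the naive one minus Kaplan-Meier estimate that treats other-cause deaths as censored. The toy data and function names are assumptions for illustration.

```python
def naive_one_minus_km(times, causes, horizon, cause=1):
    """1 - Kaplan-Meier for `cause`, treating every other outcome as censoring."""
    survival = 1.0
    event_times = sorted({t for t, c in zip(times, causes) if c == cause and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        survival *= 1.0 - d / at_risk
    return 1.0 - survival


def cumulative_incidence(times, causes, horizon, cause=1):
    """Nonparametric cumulative incidence of `cause` in the presence of competing risks.

    causes: 0 = censored, 1 = cancer-specific death, 2 = other-cause death
    """
    alive = 1.0  # probability of being alive just before the current time
    cif = 0.0
    event_times = sorted({t for t, c in zip(times, causes) if c != 0 and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d_cause = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        cif += alive * d_cause / at_risk
        alive *= 1.0 - d_any / at_risk
    return cif


# Hypothetical toy data (years of follow-up), not taken from the study.
times = [1, 2, 3, 4, 5, 6]
causes = [2, 1, 2, 1, 0, 0]

naive = naive_one_minus_km(times, causes, horizon=5)
cif = cumulative_incidence(times, causes, horizon=5)
print(f"naive 1 - KM: {naive:.3f}, cumulative incidence: {cif:.3f}")
```

In this toy example the naive estimate exceeds the cumulative incidence, which is the direction of bias the Statistical Analyses section describes.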
Cox model-based analyses of univariate predictive accuracy (Table 2) revealed that T stage represents the key univariate contributor (76.8%). The combined predictive accuracy of all variables within the full nomogram model (Fig 1) was 86.5% and exceeded the accuracy of any individual predictor. After backward step-down variable selection, TNM stages, tumor size, Fuhrman grade, and symptom classification remained in the model. These variables yielded the most predictive and most parsimonious reduced-model nomogram, with 86.3% accuracy.
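For readers unfamiliar with how regression coefficients become nomogram axes, here is a minimal sketch of one common scaling convention: the predictor with the widest effect range spans 0 to 100 points, and every other predictor is scaled proportionally. The coefficients below are invented for illustration and are not those of Table 2; tumor size and the N and M stages are omitted for brevity, and the conversion of total points into a survival probability, which requires the baseline survival function, is not shown.

```python
# HYPOTHETICAL log hazard ratios, for illustration only.
# They are NOT the coefficients reported in the article.
COEFFICIENTS = {
    "t_stage": {"T1": 0.0, "T2": 0.5, "T3": 1.0, "T4": 2.0},
    "fuhrman": {"I": 0.0, "II": 0.25, "III": 0.75, "IV": 1.5},
    "symptoms": {"none": 0.0, "local": 0.5, "systemic": 1.0},
}


def points_per_unit(coefficients):
    """Scale factor that maps the widest single-predictor effect range to 100 points."""
    widest = max(
        max(levels.values()) - min(levels.values())
        for levels in coefficients.values()
    )
    return 100.0 / widest


def nomogram_points(patient, coefficients):
    """Return (points per predictor, total points) for one patient."""
    scale = points_per_unit(coefficients)
    points = {}
    for name, levels in coefficients.items():
        baseline = min(levels.values())  # the lowest-risk level scores 0 points
        points[name] = (levels[patient[name]] - baseline) * scale
    return points, sum(points.values())


patient = {"t_stage": "T3", "fuhrman": "III", "symptoms": "local"}
points, total = nomogram_points(patient, COEFFICIENTS)
print(points, total)
```

Because the total is a rescaled linear predictor, ranking patients by total points is equivalent to ranking them by the underlying Cox model.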
The calibration plots of the internally validated (200 bootstraps) reduced model nomogram (Fig 2) are shown for 1-, 2-, 5-, and 10-year predictions. The internal validation demonstrates virtually no departures from ideal predictions. We compared the predictive accuracy of the reduced nomogram with that of the UISS. In the external validation cohort, the accuracy of our model at 1, 2, 5, and 10 years was 87.8%, 89.2%, 86.7%, and 88.8%, respectively. Conversely, at 2 and 5 years, the UISS was 86.1% and 83.9% accurate, which corresponds to gains of 3.1% (P = .007) and 2.8% (P = .02), respectively, for the nomogram relative to the UISS. Finally, Figure 3 shows the graphical comparison between the nomogram-predicted probabilities and the actual fraction surviving within the external validation cohort. The curve virtually follows the 45-degree slope, which indicates ideal performance.
Accurate prediction of cancer control after definitive treatment for RC is important for patient counseling, follow-up, and treatment planning. Recently, multivariate models based on various clinical, pathologic, and molecular parameters have been created.4-6 We hypothesized that the inclusion of traditional predictors of RC-specific survival could result in more accurate predictions than those of currently available staging systems. We used Harrell's methodology to quantify these potential gains.9,10 Our model development cohort consisted of 2,530 assessable patients who were treated with nephrectomy for RC of various stages at five European institutions. This model was subjected to 200 bootstrap resamples to validate our findings internally. Given that the gold standard for assessing the performance of a model rests on external validation, we used an additional cohort of 1,377 patients to validate our findings externally. The TNM stages were combined with tumor size, Fuhrman grade, histologic subtype, age, sex, and symptoms at presentation. Cancer-specific mortality represented our end point of interest. Assessment of mortality and determination of the cause of death were performed by the treating physician, who relied on chart review and/or death certificate. This method is consistent with other large series, in which survival was assessed by the investigators.6,8 We used the nomogram approach described by Harrell et al9,10 and popularized by Kattan et al15,16 and Sorbellini et al.17 The bootstrap-adjusted predictive accuracy (200 resamples) was used to validate the nomogram internally. The same index was used by three other groups, which devised alternative multivariate prognostic RC survival models.6,18,19 Our results yielded a multivariate model, which constitutes the basis for our nomogram.
The predictors that were retained within the most accurate and most parsimonious nomogram, termed the reduced nomogram, consisted of TNM stages, Fuhrman grade, tumor size, and symptom classification. The internally validated predictive accuracy of this nomogram (86.3%) exceeded that of the most informative individual predictor (T stage; 76.8%). External validation of the nomogram at 1, 2, 5, and 10 years after nephrectomy revealed predictive accuracy of 87.8%, 89.2%, 86.7%, and 88.8%, respectively. Conversely, the UISS prognostic scheme predicting at 2 and 5 years was less accurate, as evidenced by 86.1% (P = .006) and 83.9% (P = .02) accuracy. Kattan et al5 as well as Sorbellini et al17 developed multivariate nomograms predicting the probability of recurrence after nephrectomy for renal cell carcinoma. The accuracy of these two models was 74% (n = 601) and 82% (n = 701), respectively. Although the accuracy of our model cannot be compared directly with the accuracies of these nomograms, our results indicate that the discriminant properties of our model (86.3%) are comparable to those of other models that addressed similar end points. Zisman et al4 (n = 661) and Frank et al6 (n = 1,060) developed disease-specific survival models. Frank's model relied on TNM stages, a tumor size cutoff of 5 cm, nuclear grade, and tumor necrosis. Its bootstrap-adjusted concordance index was 83.8%. The original Zisman UISS model relied on the AJCC stage, Fuhrman grade, and ECOG performance status. However, the authors did not provide accuracy estimates. The UISS stratification criteria were applied to a cohort of 1,060 patients and demonstrated concordance indices that ranged from 79% to 86%, with a mean of 83%. Patard et al19 also applied the Zisman criteria to a large (n = 4,202) multi-institutional cohort and demonstrated a bootstrap-adjusted concordance index of 80.9%. However, no external validation was performed.
Relative to these studies, the accuracy of our nomogram is between 2.5% and 5.4% higher. This suggests that if predictions are generated for 1,000 consecutive patients, between 25 and 54 patients may be ranked incorrectly if other models are chosen over ours. This is not negligible, especially not to those potentially incorrectly ranked patients, who deserve the most accurate survival predictions. Another advantage of our nomogram over previous models resides in our sample size. Our nomogram-development cohort (n = 2,530) is the second largest and our external validation cohort (n = 1,377) is equally impressive. In addition, our cohorts are more contemporary than those used for the development and validation of the previous models. We used the most stringent statistical methodology to develop and validate our tool. Our model provides accurate predictions that span a 10-year period after nephrectomy, which exceeds the prognostic range of other models. Moreover, our tool requires routinely used variables that are virtually invariably recorded. Conversely, the model developed by Frank et al6 requires inclusion of tumor necrosis, which is not included routinely in the pathology report. Finally, our model demonstrated higher predictive accuracy than the UISS model when both models were subjected to external validation within the current study.

Our analytic approach also distinguishes itself from previous work by accounting for competing risks. Not all patients with RC die as a result of RC. Other-cause mortality may alter the risk of RC-specific mortality. This effect should be accounted for in disease models, in which the natural history of treated disease allows other-cause mortality to claim patient lives. We complemented our analyses with competing-risks regression, which addressed the significance of the combined multivariate contribution of all risk factors to RC-specific mortality, after accounting for other-cause mortality.
This analysis demonstrated that, of the variables included in the reduced nomogram, only Fuhrman grade II relative to grade I was not a statistically significant predictor of RC-specific mortality. Lack of availability of commercial software that allows development of competing-risks regression-based nomograms prevented us from relying on this methodology in our prognostic tool.

Our study is not devoid of limitations. Despite having achieved accuracy that exceeded that of other existing models, our nomogram is not perfect. Indeed, 13.7% of predictions will be made incorrectly. This flaw is shared with virtually all predictive models, given that 100% correct predictions are virtually never achieved.5,15-17,20-22 The multi-institutional nature of our data set may be interpreted as a limitation, given that it groups the contribution of multiple surgeons and pathologists and relies on different surgical approaches, in addition to other differences that might distinguish the five contributing centers. Alternatively, our strategy provides a unique opportunity to pool data and increase the statistical power of these outcome studies. Lack of central pathology review might represent another weakness. Central pathology review might have contributed to higher accuracy of pathologically assessed variables and could have improved the overall ability of the nomogram to predict RC-specific survival. Conversely, the use of local pathology analysis confirms the validity of the nomogram when it is used at large. Finally, patients with stage IV disease received a variety of treatments, which ranged from interferon to high-dose interleukin-2. Unfortunately, treatment details were not captured in the institutional databases and could not be included in this analysis.

In summary, we developed a highly accurate (86.3%) nomogram. Its accuracy is superior to all other survival tools, and in a comparison within this study, surpasses the accuracy of the UISS model.
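The figures quoted in the Discussion (gains of 2.5% to 5.4%, 25 to 54 patients per 1,000, and 13.7% incorrect predictions) follow from simple differences between concordance indices. The sketch below reproduces that arithmetic with the values stored as integers per 1,000 to avoid floating-point rounding; the dictionary keys are labels chosen here for illustration. Note that a concordance index is a proportion of usable patient pairs, so reading the difference as a count of patients is the interpretation given in the text rather than a strict identity.

```python
# Concordance indices from the text, expressed per 1,000 (86.3% -> 863).
C_INDEX_PER_1000 = {
    "reduced nomogram": 863,
    "Frank et al": 838,
    "Patard et al (UISS criteria)": 809,
}


def gain_per_1000(ours, other):
    """Difference in concordance between two models, per 1,000."""
    return C_INDEX_PER_1000[ours] - C_INDEX_PER_1000[other]


gains = {
    name: gain_per_1000("reduced nomogram", name)
    for name in C_INDEX_PER_1000
    if name != "reduced nomogram"
}
discordant_per_1000 = 1000 - C_INDEX_PER_1000["reduced nomogram"]

for name, gain in gains.items():
    print(f"vs {name}: {gain} per 1,000 ({gain / 10:.1f} percentage points)")
print(f"discordant per 1,000 for the reduced nomogram: {discordant_per_1000}")
```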
The authors indicated no potential conflicts of interest.
Conception and design: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun, Quoc-Dien Trinh, Paul Perrotte, Vincenzo Ficarra, Luca Cindolo, Alexandre De La Taille, Jacques Tostain, Peter F.A. Mulders, Laurent Salomon, Richard Zigeuner, Tommaso Prayer-Galetti, Denis Chautard, Antoine Valeri, Eric Lechevallier, Jean-Luc Descotes, Herve Lang, Arnaud Mejean, Jean-Jacques Patard

Provision of study materials or patients: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun, Paul Perrotte, Vincenzo Ficarra, Luca Cindolo, Alexandre De La Taille, Jacques Tostain, Peter F.A. Mulders, Laurent Salomon, Richard Zigeuner, Tommaso Prayer-Galetti, Denis Chautard, Antoine Valeri, Eric Lechevallier, Jean-Luc Descotes, Herve Lang, Arnaud Mejean, Jean-Jacques Patard

Collection and assembly of data: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun, Paul Perrotte, Vincenzo Ficarra, Luca Cindolo, Alexandre De La Taille, Jacques Tostain, Peter F.A. Mulders, Laurent Salomon, Richard Zigeuner, Tommaso Prayer-Galetti, Denis Chautard, Antoine Valeri, Eric Lechevallier, Jean-Luc Descotes, Herve Lang, Arnaud Mejean, Jean-Jacques Patard

Data analysis and interpretation: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun

Manuscript writing: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun, Quoc-Dien Trinh

Final approval of manuscript: Pierre I. Karakiewicz, Alberto Briganti, Felix K.-H. Chun, Quoc-Dien Trinh, Paul Perrotte, Vincenzo Ficarra, Luca Cindolo, Alexandre De La Taille, Jacques Tostain, Peter F.A. Mulders, Laurent Salomon, Richard Zigeuner, Tommaso Prayer-Galetti, Denis Chautard, Antoine Valeri, Eric Lechevallier, Jean-Luc Descotes, Herve Lang, Arnaud Mejean, Jean-Jacques Patard
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
1. Guinan P, Sobin LH, Algaba F, et al: TNM staging of renal cell carcinoma: Workgroup No. 3—Union International Contre le Cancer (UICC) and the American Joint Committee on Cancer (AJCC). Cancer 80:992-993, 1997
2. Levy DA, Slaton JW, Swanson DA, et al: Stage specific guidelines for surveillance after radical nephrectomy for local renal cell carcinoma. J Urol 159:1163-1167, 1998
3. Gettman MT, Blute ML, Spotts B, et al: Pathologic staging of renal cell carcinoma: Significance of tumor classification with the 1997 TNM staging system. Cancer 91:354-361, 2001
4. Zisman A, Pantuck AJ, Dorey F, et al: Improved prognostication of renal cell carcinoma using an integrated staging system. J Clin Oncol 19:1649-1657, 2001
5. Kattan MW, Reuter V, Motzer RJ, et al: A postoperative nomogram for renal cell carcinoma. J Urol 166:63-67, 2001
6. Frank I, Blute ML, Cheville JC, et al: An outcome prediction model for patients with clear cell renal cell carcinoma treated with radical nephrectomy based on tumor stage, size, grade and necrosis: The SSIGN score. J Urol 168:2395-2400, 2002
7. Patard JJ, Leray E, Cindolo L, et al: Multi-institutional validation of a symptom based classification for renal cell carcinoma. J Urol 172:858-862, 2004
8. Patard JJ, Leray E, Rioux-Leclercq N, et al: Prognostic value of histologic subtypes in renal cell carcinoma: A multicenter experience. J Clin Oncol 23:2763-2771, 2005
9. Harrell FE Jr, Lee KL, Mark DB: Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 15:361-387, 1996
10. Harrell FE, Califf RM, Pryor DB, et al: Evaluating the yield of medical tests. JAMA 247:2543-2546, 1982
11. Atkinson AC: A note on the generalized information criterion for choice of a model. Biometrika 67:413-418, 1980
12. Grambsch PM, Therneau TM: Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81:515-526, 1994
13. Fine JP, Gray RJ: A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc 94:496-509, 1999
14. Efron B, Tibshirani RJ: Monographs on Statistics and Applied Probability: An Introduction to the Bootstrap. Boca Raton, FL, Chapman and Hall/CRC Press, 1993, p 275
15. Kattan MW, Wheeler TM, Scardino PT: Postoperative nomogram for disease recurrence after radical prostatectomy for prostate cancer. J Clin Oncol 17:1499-1507, 1999
16. Kattan MW, Leung DH, Brennan MF: Postoperative nomogram for 12-year sarcoma-specific death. J Clin Oncol 20:791-796, 2002
17. Sorbellini M, Kattan MW, Snyder ME, et al: A postoperative prognostic nomogram predicting recurrence for patients with conventional clear cell renal cell carcinoma. J Urol 173:48-51, 2005
18. Han KR, Bleumer I, Pantuck AJ, et al: Validation of an integrated staging system toward improved prognostication of patients with localized renal cell carcinoma in an international population. J Urol 170:2221-2224, 2003
19. Patard JJ, Kim HL, Lam JS, et al: Use of the University of California Los Angeles integrated staging system to predict survival in renal cell carcinoma: An international multicenter study. J Clin Oncol 22:3316-3322, 2004
20. Chun FK, Steuber T, Erbersdobler A, et al: Development and internal validation of a nomogram predicting the probability of prostate cancer Gleason sum upgrading between biopsy and radical prostatectomy pathology. Eur Urol 49:820-826, 2006
21. Steuber T, Graefen M, Haese A, et al: Validation of a nomogram for prediction of side specific extracapsular extension. J Urol 175:939-944, 2006
22. Cindolo L, Patard JJ, Chiodini P, et al: Comparison of predictive accuracy of four prognostic models for nonmetastatic renal cell carcinoma after nephrectomy: A multicenter European study. Cancer 104:1362-1371, 2005

Submitted February 9, 2006; accepted January 9, 2007.
Copyright © 2007 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X