© 2002 American Society for Clinical Oncology
Influence of Unrecognized Molecular Heterogeneity on Randomized Clinical Trials
From the Department of Biostatistics, Harvard School of Public Health and Department of Pathology and Neurosurgical Service, Massachusetts General Hospital and Harvard Medical School, Boston, MA, and the Department of Oncology, University of Western Ontario and the London Regional Cancer Centre, London, Ontario, Canada. Address reprint requests to Rebecca A. Betensky, PhD, Harvard School of Public Health, 655 Huntington Ave, Boston, MA 02115; email: betensky{at}hsph.harvard.edu
PURPOSE: In solid tumor oncology, decisions regarding treatment and eligibility for trials are governed by histologic diagnosis. Despite this reliance on histology and the assumption that histology defines the disease, underlying molecular heterogeneity likely differentiates among patient outcomes.
PATIENTS AND METHODS: To illustrate how unrecognized molecular heterogeneity might obscure a truly effective new therapy for cancer, we analyzed the planning assumptions and results of a hypothetical randomized controlled trial of chemoradiotherapy for a cancer found to be drug sensitive in preliminary phase II studies.
RESULTS: Randomized controlled trials of effective cancer therapies can be falsely negative if therapeutic benefit is overestimated during study design because of enrichment of phase II trials for treatment-sensitive subtypes, if a beneficial effect in responding patients is diluted by large numbers of nonresponding patients, or if a beneficial effect in responders is reversed by a negative effect in nonresponders.
CONCLUSION: Molecular heterogeneity, if it confers different risks to patients and is unaccounted for in the design of a randomized study, can result in a clinical trial that is underpowered and fails to detect a truly effective new therapy for cancer.
HISTOLOGIC ASSESSMENT is the cornerstone of diagnosis in solid tumor oncology and, together with staging, guides the treatment of patients with all types of cancer. Histologic diagnosis is also the principal eligibility criterion for clinical trials in oncology. Implicit in these longstanding practices is the assumption, likely erroneous, that cancers that seem histologically identical constitute a single disease. This notion is surely false, because clinicians have known for decades that cancers of a specific histologic type often respond differently to treatment and because scientists have now demonstrated that different patterns of gene expression can be seen in cancers that appear indistinguishable under the microscope.1 Moreover, evidence is emerging that the variations in response to treatment and survival that are seen among patients with histologically indistinguishable cancers are governed by such biologic differences.2 In neuro-oncology, for example, specific molecular changes already identify several genetic subtypes of anaplastic oligodendroglioma, only one of which is highly chemosensitive.3 These clinically distinct subtypes of anaplastic oligodendroglioma are indistinguishable microscopically. Identification of molecular subtypes and the use of molecular signatures to direct treatment are presently becoming part of clinical practice in neuro-oncology. The prospect that each type of cancer may be several distinct diseases will have important implications for the design and interpretation of randomized trials of new therapies in oncology, especially if molecular heterogeneity implies that only some patients in a clinical trial harbor sensitive tumors. 
In statistical terms, when important variables are omitted, errors occur in clinical trial design and analysis because the regression model is inadvertently misspecified.4-9 All of these studies present general results for analyses of survival that are applicable to the omission of any variable that divides the patients into two distinct groups. These results include the characterization of the bias in the estimate of the treatment effect and of the loss of statistical efficiency. With one exception,10 these studies assume that the treatment effect does not depend on the omitted variable (ie, there is no interaction). In a more applied study,11 the authors are solely concerned with the dilution of a treatment effect by the presence of a group of subjects for whom there was no treatment effect. They examine only dichotomous end points (ie, response and survival to a certain time point) and assume there to be no treatment effect for one of the groups defined by the omitted variable. In this study, we critique power and sample size calculations for a realistic hypothetical trial, illustrating that unrecognized molecular heterogeneity can lead to an underpowered, falsely negative study. Our work can be viewed as an application of the theoretical results of Lagakos and Schoenfeld,10 with an emphasis on the case in which the treatment effect depends on the omitted covariate; in contrast to efficiency calculations, we express our results in terms of power and sample size. We consider several scenarios that describe the impact of genetic subtype on patient survival in the context of treatment effects, extending the work of Fijal et al11 in two important directions. First, our results are derived from a proportional hazards regression model and therefore do not depend on a specific time point. Second, we consider the implications of different treatment effects for each molecular subtype.
The issues raised here are applicable to any disease that seems to be a single disorder but, in reality, is several disorders. They are also applicable to any omitted variable (eg, demographic, clinical, radiographic, and so on) that is predictive of the end point of interest.
Hypothetical Trial
The randomized trial under consideration compares two approaches to the initial treatment of patients with a specific type of cancer. The experimental treatment is chemoradiotherapy, the standard treatment is radiotherapy alone, and the outcome of interest is overall survival. Interest in chemoradiotherapy arose when it became apparent that this type of cancer was also chemosensitive. Chemosensitivity was demonstrated in previously irradiated recurrent cases. In this setting, high rates of response and long durations of response to chemotherapy were noted. Also, responders to chemotherapy had unexpectedly long overall survival times. This observation led investigators to hypothesize that patients treated with chemoradiotherapy at diagnosis would live significantly longer than patients receiving standard treatment who, historically, lived 4 years on average. A randomized controlled trial is designed in which patients treated with chemoradiotherapy at diagnosis are anticipated to live 50% longer than those receiving radiotherapy alone (ie, median survival times, 6 v 4 years). Unknown to the trialists, however, the cancer type under study is of two genetic subtypes that are indistinguishable histologically but confer different risks to patients. In the general population, 50% of patients with this cancer have genetic subtype 1 and 50% have subtype 2. Also unknown to the investigators, patients with genetic subtype 1 are overrepresented in the phase II trials that spawn the randomized study. Because of inadvertent selection bias, a characteristic of uncontrolled trials, 90% of the patients in the successful phase II studies have genetic subtype 1. Thus, the trialists' hypothesis that chemoradiotherapy will confer a 50% increase in survival for patients in the experimental arm is premature and may be overly optimistic, because the effectiveness of chemotherapy for patients whose tumors harbor genetic subtype 2 has not yet been assessed.
Statistical Models
The hazard functions for patients in each of the treatment groups, derived from models 1 and 2 under the assumption of constant baseline hazard functions, are listed in Table 1. These derived hazard functions are used to calculate the power of the simple log-rank test (based on model 1) when model 2 holds. Note that, in general, the hazard functions derived from model 2 depend on t and thus are not proportional as specified by model 1.
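The definitions of models 1 and 2 are not reproduced in this copy of the text. One form that would be consistent with the three coefficients quoted for each study scenario is sketched here. It is offered as an assumption, not as the authors' exact notation: the symbols Z (indicator of chemoradiotherapy), G (indicator of genetic subtype 1), and \pi_g (population proportion of subtype g) are hypothetical labels introduced for illustration. The last line shows why, with a constant baseline hazard \lambda_0, the hazard within a treatment arm varies with t once G is left out.

```latex
\begin{align}
\text{Model 1:}\quad \lambda(t \mid Z) &= \lambda_0(t)\,\exp(\beta Z) \\
\text{Model 2:}\quad \lambda(t \mid Z, G) &= \lambda_0(t)\,\exp(\beta_1 Z + \beta_2 G + \beta_3 Z G) \\
\text{Model 2, } G \text{ unobserved:}\quad
\lambda(t \mid Z) &= \frac{\sum_{g \in \{0,1\}} \pi_g\, \lambda_{Zg}\, e^{-\lambda_{Zg} t}}
                         {\sum_{g \in \{0,1\}} \pi_g\, e^{-\lambda_{Zg} t}},
\qquad \lambda_{Zg} = \lambda_0 \exp(\beta_1 Z + \beta_2 g + \beta_3 Z g)
\end{align}
```

Under this reading, the arm-level hazard is a weighted average of the subtype hazards whose weights shift over time toward the longer-surviving subtype, so the ratio of the two arm-level hazards is generally not constant.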
Study Scenarios
We now consider three plausible scenarios that capture different effects of treatment and genetic subtype on survival. Each scenario is defined by different values for the β parameters in statistical model 2. In each scenario, we assume that the true effect of chemoradiotherapy is to increase the median survival of patients with genetic subtype 1 by 50%, as hypothesized by the trialists. Also, in each scenario, we assume that the median survival after radiotherapy alone is 6 years for patients with genetic subtype 1 and 2 years for patients with genetic subtype 2, implying that the median survival after chemoradiotherapy is 9 years for patients with genetic subtype 1. In scenario I, we assume that the beneficial effect of chemoradiotherapy is independent of genetic subtype, and hence the median survival of patients with genetic subtype 2 also increases by 50% to 3 years. Under these circumstances, the values for β in statistical model 2 are as follows: β1 = -0.405, β2 = -1.1, and β3 = 0. In scenario II, we assume that chemoradiotherapy is ineffective for genetic subtype 2 but remains effective for subtype 1; under these circumstances, β1 = 0, β2 = -1.1, and β3 = -0.405. In scenario III, we assume that chemoradiotherapy has a deleterious effect in some situations and decreases the median survival of patients with genetic subtype 2 by 25% to 1.5 years; under these circumstances, β1 = 0.286, β2 = -1.1, and β3 = -0.691.
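As a check on the scenario coefficients, the following minimal sketch recomputes them from the stated median survival times. It assumes constant (exponential) hazards, so that a log hazard ratio equals the log of the inverse ratio of medians, and it assumes a coding in which subtype 2 on radiotherapy alone is the reference group; the function names are hypothetical. Under that coding the interaction term for scenario II comes out negative (a benefit confined to subtype 1), and the scenario III values agree with the quoted ones to about the second decimal place.

```python
import math

def log_hazard_ratio(median_reference, median_comparison):
    """Log hazard ratio (comparison vs reference) under constant hazards.

    With an exponential survival time, hazard = ln(2) / median, so the
    hazard ratio is median_reference / median_comparison.
    """
    return math.log(median_reference / median_comparison)

def model2_coefficients(m2_rt, m1_rt, m2_crt, m1_crt):
    """Return (beta1, beta2, beta3) from the four subgroup medians (years).

    Assumed coding (hypothetical): reference = subtype 2 on radiotherapy.
    beta1: treatment effect in subtype 2
    beta2: subtype 1 vs subtype 2 on radiotherapy alone
    beta3: extra treatment effect in subtype 1 (interaction)
    """
    beta1 = log_hazard_ratio(m2_rt, m2_crt)
    beta2 = log_hazard_ratio(m2_rt, m1_rt)
    beta3 = log_hazard_ratio(m1_rt, m1_crt) - beta1
    return beta1, beta2, beta3

# Medians taken from the scenario descriptions: subtype 2 RT, subtype 1 RT,
# subtype 2 chemoradiotherapy, subtype 1 chemoradiotherapy.
scenarios = {
    "I": model2_coefficients(2.0, 6.0, 3.0, 9.0),
    "II": model2_coefficients(2.0, 6.0, 2.0, 9.0),
    "III": model2_coefficients(2.0, 6.0, 1.5, 9.0),
}

if __name__ == "__main__":
    for name, (b1, b2, b3) in scenarios.items():
        print(f"Scenario {name}: beta1={b1:.3f} beta2={b2:.3f} beta3={b3:.3f}")
```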
Given that the median survival of patients receiving standard treatment (ie, radiotherapy alone) is 4 years and 50% of such patients have genetic subtype 1, we calculate the true baseline hazard,
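The numerical value of the baseline hazard is not reproduced in this copy. The sketch below shows one way such a value could be obtained, under assumptions that are ours rather than the authors': a constant baseline hazard, a subtype coefficient of -1.1 for subtype 1, and a control arm that is a 50:50 mixture of the two subtypes. It solves, by bisection, for the baseline hazard at which the mixture survival at 4 years equals 0.5. All function names are hypothetical.

```python
import math

def control_arm_survival(t, baseline_hazard, beta2, p_subtype1):
    """Survival at time t in the radiotherapy-alone arm (mixture of subtypes).

    Assumes constant hazards: baseline_hazard for subtype 2 and
    baseline_hazard * exp(beta2) for subtype 1.
    """
    s_subtype1 = math.exp(-baseline_hazard * math.exp(beta2) * t)
    s_subtype2 = math.exp(-baseline_hazard * t)
    return p_subtype1 * s_subtype1 + (1.0 - p_subtype1) * s_subtype2

def solve_baseline_hazard(median, beta2, p_subtype1, lo=1e-6, hi=10.0, tol=1e-12):
    """Bisection for the hazard at which mixture survival at `median` is 0.5.

    Survival at a fixed time decreases as the hazard increases, so the
    root is bracketed whenever survival is > 0.5 at lo and < 0.5 at hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if control_arm_survival(median, mid, beta2, p_subtype1) > 0.5:
            lo = mid  # hazard too small: survival still above one half
        else:
            hi = mid
    return 0.5 * (lo + hi)

baseline_hazard = solve_baseline_hazard(median=4.0, beta2=-1.1, p_subtype1=0.5)

if __name__ == "__main__":
    print(f"baseline hazard per year: {baseline_hazard:.4f}")
```

Because the mixture is not itself exponential, the subtype-specific medians implied by a hazard found this way need not match round figures exactly.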
Power Calculations
For each scenario, Fig 1 depicts the power of the log-rank test, which implicitly assumes model 1 to hold and, thus, no genetic heterogeneity. Power is displayed as a function of the true probability of genetic subtype 1 in the population of patients eligible for the randomized study. Also for each scenario, Table 2 lists the corresponding sample sizes required to achieve 80% power for the survival comparison between chemoradiotherapy and radiotherapy alone. The sample sizes are calculated as a function of the true probability of genetic subtype 1 in the population of study-eligible cases. It is evident from Fig 1 and Table 2 that, in most situations, the hypothetical randomized controlled trial designed to compare chemoradiotherapy to radiotherapy alone is underpowered and too small. Study design errors occur because the trialists are unaware of molecular heterogeneity and also unaware that patients with genetic subtype 1 are overrepresented in preliminary studies; consequently, they mistakenly choose statistical model 1 to calculate power and sample size for the randomized trial and overestimate the beneficial effect of chemoradiotherapy for this cancer type.
For scenario I, where the effect of chemoradiotherapy is independent of genetic subtype, that is, both types benefit, the power of the log-rank test is 80% or greater only when the true proportion of patients eligible for the randomized trial with genetic subtype 1 is less than 0.2. This is attributable to the shorter survival times among patients with genetic subtype 2, who are predominant, leading to more observed deaths and thus greater power. If, as expected, the true proportion of genetic subtype 1 in the population of randomized cases is 0.5, the phase III study will have 68% power for the chemoradiotherapy versus radiotherapy alone comparison. The sample size required to have 80% power in this situation is approximately 386, substantially higher than the 286 calculated by the trialists, and is attributable to the difference in the expected number of deaths in the two genetic groups.
For scenario II, in which chemoradiotherapy has an effect only for patients with genetic subtype 1, the power of the log-rank test is always less than 80%. If, as expected, the proportion of genetic subtype 1 in the population of patients eligible for randomization is 0.5, the trial will have 21% power for the chemoradiotherapy versus radiotherapy alone comparison. Under these circumstances, the sample size required for 80% power is 1,693. The reduced power and the need for a much larger sample size are a consequence of the dilution of the benefit for subtype 1 by subtype 2 cases.
For scenario III, in which chemoradiotherapy has a detrimental effect on patients with genetic subtype 2, the power is a nonmonotone function of the true probability of genetic subtype 1. This is attributable to the opposing effects of chemoradiotherapy in the two genetic subgroups; the power for detecting the effect for subtype 1 is large for high true probabilities, whereas the power for detecting the effect for subtype 2 is large for low true probabilities. If, as expected, the proportion of genetic subtype 1 in the population of patients eligible for randomization is 0.5, the trial will have 7% power. Under these circumstances, the sample size required for 80% power is 14,159. In this scenario, power is reduced additionally and sample size requirements exceed the scope of conventionally sized randomized trials, because the survival advantage conferred by chemoradiotherapy for patients with genetic subtype 1 is negated by the deleterious effects of this treatment approach for those with subtype 2.
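To make the planning calculation under model 1 concrete, the sketch below computes the number of deaths a log-rank comparison would need to detect the hypothesized 6 v 4 year difference in medians. It uses the standard events formula associated with Schoenfeld's work on the log-rank test, and it assumes a two-sided 5% significance level, 80% power, and 1:1 randomization; the significance level and allocation are our assumptions, as the text does not state them. The function name is hypothetical. The result is a count of deaths, not patients: turning it into an enrollment figure such as the 286 quoted above requires accrual and follow-up assumptions that are not given here.

```python
import math
from statistics import NormalDist

def required_events(hazard_ratio, alpha=0.05, power=0.80, allocation=0.5):
    """Deaths needed for a two-sided log-rank test under proportional hazards.

    d = (z_{1-alpha/2} + z_{power})^2 / (a * (1 - a) * (log HR)^2),
    where a is the fraction of patients allocated to one arm.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    events = (z_alpha + z_power) ** 2 / (allocation * (1.0 - allocation) * log_hr ** 2)
    return math.ceil(events)

# Hypothesized effect: median 6 v 4 years, ie, hazard ratio 4/6 under
# constant hazards.
planned_events = required_events(4.0 / 6.0)

if __name__ == "__main__":
    print(f"deaths required under model 1: {planned_events}")
```

Because the required number of deaths grows with the inverse square of the log hazard ratio, any shrinkage of the average treatment effect toward zero, as in scenarios II and III, inflates the requirement sharply.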
Unrecognized molecular heterogeneity, when it confers different risks to patients, can undermine the power of a randomized trial to detect a truly beneficial therapy. In the hypothetical phase III study described here, power and sample size calculations are satisfactory only when the beneficial survival effect of chemoradiotherapy is independent of genetic subtype and the proportion of patients entering the trial with genetic subtype 2 is greater than 80%. In all other situations that we have considered, the randomized study is underpowered and too small to demonstrate a difference between chemoradiotherapy and radiotherapy alone, even though patients with genetic subtype 1 truly benefit from chemoradiotherapy. Hence, falsely negative randomized trials can result from three causes: overestimation of a therapeutic effect used for study design because of enrichment of phase II studies for a treatment-sensitive subtype, dilution of a beneficial effect in responding patients by large numbers of nonresponding patients, or reversal of a beneficial effect in responders by a negative effect in nonresponders (ie, negatively responding patients). Consideration of these potential causes of falsely negative trials within a proportional hazards regression model extends previous studies that have considered only dichotomous end points11 or have focused only on the second cause.4-11 Underpowered trials are of great concern, because cancer therapies that are truly effective for specific subsets of patients may be missed. Such was the case for estrogen therapy in prostate cancer, for which the real benefit of treatment in younger men was obscured by its severe cardiovascular toxicity in older patients.13 Molecular heterogeneity may soon be an important issue in clinical research.
Consider anaplastic oligodendroglioma, where at least three clinically distinct genetic subtypes have now been identified: those with allelic loss of chromosome 1p (1p LOH), those with 1p intact and a TP53 mutation, and those with neither 1p LOH nor TP53 mutation.3 Anaplastic oligodendrogliomas with 1p LOH respond to chemotherapy, and patients with this subtype have long survival times. In contrast, when chromosome 1p and the TP53 gene are intact, responses to chemotherapy are infrequent and such patients have short survival times. Moreover, the proportions of these genetic subtypes of anaplastic oligodendroglioma differ in different cohorts of patients. In recurrent cases, the situation in which sensitivity to chemotherapy was first noted, up to 90% have had allelic loss of chromosome 1p.14 However, in series of newly diagnosed anaplastic oligodendrogliomas, in which rates of response to chemotherapy have been lower, less than 60% have had 1p LOH.3 Such differences may be important, because the ongoing randomized trials of chemoradiotherapy for anaplastic oligodendroglioma were designed based on phase II experience, where virtually all tumors of this histologic type seemed to be drug sensitive. The North American and European trials comparing chemotherapy plus radiotherapy versus radiotherapy alone for patients with anaplastic oligodendroglioma are nearing completion. Because both consortia collected tumor tissue on all randomized cases, a post hoc clinical-molecular correlative study will be feasible, affording an unprecedented opportunity to explore the issues raised in this hypothetical analysis in a real-life setting. Correlation of the molecular alterations now known to be associated with anaplastic oligodendroglioma with the outcomes of the patients in these trials may enhance the interpretation of the results of both studies, whether they are positive or negative. 
In the case of glioblastoma, another genetically heterogeneous cancer, randomized controlled trials have failed to demonstrate a survival benefit after the addition of chemotherapy to radiotherapy despite the fact that radiographic responses to chemotherapy occur occasionally. Although there is no evidence to support the routine use of adjuvant chemotherapy for patients with glioblastoma, the possibility that specific subsets of patients might benefit from such treatment remains an open question. Randomized controlled trials of adjuvant chemotherapy have not been designed to exclude this possibility. Indeed, as shown here, a substantial increase in survival can be lost in a randomized trial if the proportion of patients benefiting from the new therapy is small or if the intervention, although helpful for some, is deleterious for others. Moreover, the frequency of specific genetic subtypes of glioblastoma is highly dependent on patient age. For glioblastoma, age seems to be a surrogate marker for genetic signature. Hence, in the context of negative trials, age may be an important confounding variable. Case series and clinical experience suggest that radiographic responses to chemotherapy occur more commonly in younger patients with glioblastoma.15 However, in randomized trials of adjuvant chemotherapy, most of which have been negative, older patients predominate. Furthermore, phase II studies of chemotherapy for recurrent tumors may select for glioblastomas that are inherently less aggressive (presumably, genetically determined), whereas less selection may occur in phase III trials. The recent finding that proliferative indices are significantly lower in younger patients with glioblastoma speaks to the heterogeneous biologic nature of this cancer.16 In practice, there are steps that can be taken to avoid the statistical pitfalls highlighted here. 
For example, if there is a suspicion of unobserved heterogeneity among patients entering a randomized trial, one might consider many different plausible outcomes, evaluate their effects on the power of the trial, and increase the sample size to protect against these outcomes. However, this approach has limitations, because the worst-case-scenario sample sizes may be unachievably large or, more likely, the nature of the heterogeneity may be completely unknown. Alternatively, interim analyses can be used to update the design based on the observed data and thereby protect against an underpowered trial. In particular, the hazard functions for the two treatment groups can be estimated at an interim time point and the sample size recomputed accordingly.17,18 Oncologists have long suspected that subsets of patients who benefit from specific therapies might be hidden in larger groups of resistant cases. Only recently, however, has it become apparent that these elusive treatment-sensitive subtypes of cancer may have specific molecular signatures that we can learn to recognize. In due course, molecular characterization of cancers may supplant traditional histologic diagnosis as the principal eligibility criterion for randomized trials of new therapies; at the very least, molecular features are likely to become important stratification variables in such trials. Additionally, molecular characterization may become an integral part of the analysis of phase II studies, because molecular markers that distinguish responding from nonresponding tumors could be of great assistance in planning subsequent randomized trials that evaluate new therapies more thoroughly. The challenge, therefore, will be for clinical research methodologies to keep pace with molecular diagnostic and therapeutic breakthroughs.
Supported by the Canadian Institutes for Health Research grant no. 37849 and National Institutes of Health grant nos. R01 CA57683 and R29 CA75971. We thank Glenn Bauman, MD, Eric Winquist, MD, and Yi Li, PhD, for reviewing the manuscript and Rebecca Desgroseilliers for preparing it.
1. Perou CM, Sorlie T, Eisen MB, et al: Molecular portraits of human breast tumours. Nature 406: 747-752, 2000
2. Alizadeh AA, Eisen MB, Davis RE, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503-511, 2000
3. Ino Y, Betensky RA, Zlatescu MC, et al: Molecular subtypes of anaplastic oligodendroglioma: Implications for patient management at diagnosis. Clin Cancer Res 7: 839-845, 2001
4. Gail MH, Wieand S, Piantadosi S, et al: Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika 71: 431-444, 1984
5. Solomon PJ: Effect of misspecification of regression models in the analysis of survival data. Biometrika 71: 291-298, 1984
6. Struthers CA, Kalbfleisch JD: Misspecified proportional hazard models. Biometrika 73: 363-369, 1986
7. Chastang C, Byar D, Piantadosi S: A quantitative study of the bias in estimating the treatment effect caused by omitting a balanced covariate in survival models. Stat Med 7: 1243-1255, 1988
8. Bretagnolle J, Huber-Carol C: Effects of omitting covariates in Cox's model for survival data. Scand J Stat 15: 125-138, 1988
9. Schmoor C, Schumacher M: Effects of covariate omission and categorization when analysing randomized trials with the Cox model. Stat Med 16: 225-237, 1997
10. Lagakos SW, Schoenfeld DA: Properties of proportional-hazards score tests under misspecified regression models. Biometrics 40: 1037-1048, 1984
11. Fijal BA, Hall JM, Witte JS: Clinical trials in the genomic era: effects of protective genotypes on sample size and duration of trial. Control Clin Trials 21: 7-20, 2000
12. Schoenfeld D: The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika 68: 316-319, 1981
13. Byar DP: Assessing apparent treatment-covariate interactions in randomized clinical trials. Stat Med 4: 255-263, 1985
14. Cairncross JG, Ueki K, Zlatescu MC, et al: Specific genetic predictors of chemotherapeutic response and survival in patients with anaplastic oligodendrogliomas. J Natl Cancer Inst 90: 1473-1479, 1998
15. Grant R, Liang BC, Page MA, et al: Age influences chemotherapy response in astrocytomas. Neurology 45: 929-933, 1995
16. McKeever PE, Junck L, Strawderman MS, et al: Proliferation index is related to patient age in glioblastoma. Neurology 56: 1216-1218, 2001
17. Scharfstein DO, Tsiatis AA: The use of simulation and bootstrap in information-based group sequential studies. Stat Med 17: 75-87, 1998
18. Proschan MA, Hunsberger SA: Designed extension of studies based on conditional power. Biometrics 51: 1315-1324, 1995
Submitted June 25, 2001; accepted March 1, 2002.
Copyright © 2002 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X