Journal of Clinical Oncology, Vol 26, No 22 (August 1), 2008: pp. 3715-3720
© 2008 American Society of Clinical Oncology.
DOI: 10.1200/JCO.2007.14.1044

Phase II Stopping Rules That Employ Response Rates and Early Progression

John R. Goffin, Dongsheng Tu

From the Juravinski Cancer Centre, McMaster University, Hamilton; and the National Cancer Institute of Canada Clinical Trials Group and Department of Mathematics and Statistics, Queen's University, Kingston, Ontario, Canada

Corresponding author: John R. Goffin, Juravinski Cancer Centre, 699 Concession St, Hamilton, Ontario L8V 5C2; e-mail: john.goffin{at}hrcc.on.ca


    ABSTRACT
 
Purpose Phase II oncology trials traditionally have used response rate (RR) as the primary end point, but newer targeted agents require the consideration of alternative end points. High rates of early progressive disease (EPD) suggest inadequate drug activity and may be useful in the early stopping of trials. This study used a simulation to define a set of rules to assess a combined end point of RR and EPD.

Methods The simulation assumed a two-stage trial with a specified α error and power. It randomly generated the true response rate, r, of the agent under study and its true rate of early progressive disease, epd, for each run of the simulation. Two pairs of parameters were specified: (rnul, epdnul) and (ralt, epdalt). A drug was considered uninteresting for further development if r was less than or equal to rnul and epd was greater than or equal to epdnul (ie, the null hypothesis) and interesting for further development if r was greater than or equal to ralt or epd was less than or equal to epdalt (ie, the alternate hypothesis). Thresholds for the required numbers of patients with responses (nr) and with EPD (np) were generated for each set of parameters.

Results Thresholds for nr and np that satisfied the specified error rates were generated. There was at least an 89% likelihood that a study would be stopped at the first stage of accrual if r and epd were uninteresting.

Conclusion The simulation was able to establish stopping rules by combining the RR and the EPD that achieved the desired error rates. High rates of early stopping suggest that this design could shorten phase II trials of inactive agents.


    INTRODUCTION
 
Drug development in oncology has become increasingly resource-intensive, even as more agents come under study.1,2 The phase II study has a limited ability to discriminate drugs that are suitable for phase III study,3,4 and improvement of the efficiency of phase II studies could accelerate drug development.

The conventional use of response rate (RR) in phase II trials is tied to the assumption that such responses are associated with better survival. Data generally support this association,5-8 but some drugs induce minimal response in particular settings while still improving survival.9,10 In addition, stable disease is likely to be associated with an improvement in survival.6,11-14 Recent drug development has been directed toward more targeted therapies. Some of these drugs are likely to cause cell death, but other agents—drugs termed cytostatic—may be more likely to induce only stabilization of disease and, as such, will require different considerations when activity is assessed.15

To monitor for at least a minimal signal of disease stabilization, one can employ the end point of early progressive disease (EPD) in phase II trials. This may be defined as progression of disease at the first point of disease remeasurement after treatment. Drugs that allow a significant proportion of the population to undergo early disease progression are potentially unattractive.

Traditional, two-stage, phase II designs with one binary end point, as developed by Fleming,16 Gehan,17 and others18 can be used to assess RR or proportions of EPD. Zee et al19 proposed a new design that would incorporate both RR and proportions of EPD as the end points in a phase II trial. The goal was to increase the likelihood of early termination of trials with agents that showed excessive early progression and a limited rate of response. The desired alternate hypothesis was to find drugs with greater than minimally specified RR or less than the specified EPD rate. Although attractive in concept, the authors found from simulations that the power of the studies with the decision rule from this design was less than that set in the design.20

Chang et al21 considered a two-stage design of phase II trials with both RR and the EPD rate in the context of a window-of-opportunity study. Their study employed an "and" in the alternate hypothesis and included the expectation that drugs should have both a good RR and a minimal EPD rate to be attractive in a previously untreated population. They indicated that their design may be used to assess cytostatic agents or previously treated patients by switching the null and alternative hypotheses. However, their design allows for early acceptance of seemingly interesting drugs at the end of the first stage, which is not a standard practice in phase II oncology trials.

This article develops stopping rules for phase II trials by using the same framework as Zee et al.19 It employs both RR and EPD rate and can be used to assess drugs in the setting of advanced disease. An approach that is based on simulations was adopted.


    METHODS
 
The simulation lets r be the true response rate of the agent that is being studied and epd its true rate of early progressive disease. In this simulation, one assumes that two pairs of parameters, (rnul, epdnul) and (ralt, epdalt), can be specified that would render a drug uninteresting for further development if r is less than or equal to rnul and epd is greater than or equal to epdnul and interesting for further development if r is greater than or equal to ralt or epd is less than or equal to epdalt. That is, we are interested in testing the following hypotheses in a phase II trial: the null hypothesis of r being less than or equal to rnul and epd being greater than or equal to epdnul versus the alternate hypothesis of r being greater than or equal to ralt or epd being less than or equal to epdalt.

This article considers the following two-stage procedure for testing the above hypotheses: In the first stage, n1 patients are entered. Let n1r and n1p be, respectively, the number of patients who responded and the number who had early progression among these n1 patients. The trial would be stopped at this stage if n1r is less than or equal to n1r-nul and n1p is greater than or equal to n1p-nul, in which n1r-nul and n1p-nul are two thresholds determined during the design of the study. Otherwise, n2 additional patients are entered in the second stage of the study. Let n2r and n2p be, respectively, the number of patients who responded and the number who had early progression among these n2 patients. The drug will be declared interesting at the end of the second stage if n1r + n2r ≥ n1r-alt + n2r-alt or n1p + n2p ≤ n1p-alt + n2p-alt, in which n2r-alt and n2p-alt are another two thresholds determined during the design of the study.
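The two-stage rule described above can be sketched in Python. This is an illustration only, not the authors' TreeAge program: the class and function names are hypothetical, and the example thresholds are those this article reports for its borderline model in the comparison with Chang et al (n1 = n2 = 21; stop if n1r ≤ 10 and n1p ≥ 5; declare interesting if n1r + n2r ≥ 21 or n1p + n2p ≤ 8).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Design thresholds (hypothetical container, names are illustrative)."""
    n1r_nul: int      # stage 1: stop if responders <= this ...
    n1p_nul: int      # ... and early progressions >= this
    r_alt_total: int  # stage 2: interesting if total responders >= this
    p_alt_total: int  # ... or total early progressions <= this


def stop_at_stage1(n1r: int, n1p: int, t: Thresholds) -> bool:
    """Stop early only when BOTH futility conditions hold (the 'and' null)."""
    return n1r <= t.n1r_nul and n1p >= t.n1p_nul


def interesting_at_stage2(n1r: int, n2r: int, n1p: int, n2p: int,
                          t: Thresholds) -> bool:
    """Declare interesting when EITHER condition holds (the 'or' alternate)."""
    return (n1r + n2r >= t.r_alt_total) or (n1p + n2p <= t.p_alt_total)


# Thresholds quoted in the Discussion for the borderline model (n1 = n2 = 21).
example = Thresholds(n1r_nul=10, n1p_nul=5, r_alt_total=21, p_alt_total=8)
```

Because the stopping rule is a conjunction, a trial with few responses but also few early progressions continues to stage 2, which is the intended behavior for agents that mainly stabilize disease.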

Simulations were performed with TreeAge Pro Healthcare software (Williamstown, MA) to determine thresholds (program available on request). For each simulation, the following parameters were prespecified: design parameters (rnul, epdnul) and (ralt, epdalt), desired power and α error of the study, and the number of patients in the first stage (n1) and second stage (n2). For each set of potential thresholds, 1,000,000 simulations were run to assess the α error and power of the study as the frequency of rejection of the null hypothesis. The final set of thresholds was selected as the one that yielded the α error and power closest to that specified. When the α error or power could be better than specified, this was permitted if it did not worsen the other error. The r and epd in each of the simulations were generated by using one of the following two methods:
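A minimal Monte Carlo sketch of this kind of error assessment follows. It assumes fixed true values of r and epd and independent patients whose outcome is a response, an early progression, or neither. It uses Python's standard random module rather than TreeAge Pro, all function and parameter names are hypothetical, and the default number of runs is far smaller than the 1,000,000 used in the study.

```python
import random


def simulate_trial(r, epd, n1, n2, n1r_nul, n1p_nul,
                   r_alt_total, p_alt_total, rng):
    """Run one two-stage trial; return (null_rejected, patients_accrued)."""

    def accrue(n):
        # Each patient responds with probability r, progresses early with
        # probability epd, and otherwise has neither outcome.
        responders = progressors = 0
        for _ in range(n):
            u = rng.random()
            if u < r:
                responders += 1
            elif u < r + epd:
                progressors += 1
        return responders, progressors

    n1r, n1p = accrue(n1)
    if n1r <= n1r_nul and n1p >= n1p_nul:
        return False, n1                      # stopped at stage 1
    n2r, n2p = accrue(n2)
    rejected = (n1r + n2r >= r_alt_total) or (n1p + n2p <= p_alt_total)
    return rejected, n1 + n2


def estimate_operating_characteristics(r, epd, n1, n2, n1r_nul, n1p_nul,
                                       r_alt_total, p_alt_total,
                                       n_sims=10_000, seed=0):
    """Return (frequency of rejecting the null, mean number of patients)."""
    rng = random.Random(seed)
    rejections = 0
    total_patients = 0
    for _ in range(n_sims):
        rejected, accrued = simulate_trial(
            r, epd, n1, n2, n1r_nul, n1p_nul, r_alt_total, p_alt_total, rng)
        rejections += rejected
        total_patients += accrued
    return rejections / n_sims, total_patients / n_sims
```

When (r, epd) lies in the null region, the returned frequency estimates the α error; when it lies in the alternate region, it estimates power. Averaging over many randomly drawn (r, epd) pairs, as the study did, would require wrapping this call in an outer sampling loop.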

The full-space method, for simulations of the alternative hypothesis, used a weighted probability (that was based on the space created by the selected [ralt, epdalt] parameters) to randomly select a value for r or epd from within the range of interest. Thus, either r greater than or equal to ralt or epd less than or equal to epdalt is generated. The remaining parameter then is randomly selected from the values that satisfy r + epd ≤ 1.

For simulations of the null hypothesis, the same method was used, but r and epd both were required to satisfy r less than or equal to rnul and epd greater than or equal to epdnul.

The borderline-value method assumed that extremely desirable values for r and epd were unlikely and that the inclusion of such values in assessed populations would under-power a study to detect drugs with borderline r or epd characteristics. In the borderline-value method, cohorts to assess the alternate hypothesis were generated by random assignment of r as ralt or epd as epdalt. The nonprecedent parameter then was randomly assigned a value within its noninteresting range (ie, r < ralt or epd > epdalt) such that r + epd ≤ 1. Cohorts for the null hypothesis assessment were generated as described for the full-space method.
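The two generation methods can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's procedure: the full-space samplers draw (r, epd) uniformly from the relevant region by rejection sampling instead of the weighted choice of a precedent parameter, the borderline sampler fixes r or epd with equal probability (the source says only that the assignment was random), and it assumes ralt + epdalt ≤ 1. All function names are hypothetical.

```python
import random


def sample_full_space_alt(r_alt, epd_alt, rng):
    """Uniform draw from {r >= r_alt or epd <= epd_alt} with r + epd <= 1."""
    while True:
        r, epd = rng.random(), rng.random()
        if r + epd <= 1 and (r >= r_alt or epd <= epd_alt):
            return r, epd


def sample_full_space_null(r_nul, epd_nul, rng):
    """Uniform draw from {r <= r_nul and epd >= epd_nul} with r + epd <= 1."""
    while True:
        r = rng.uniform(0.0, r_nul)
        epd = rng.uniform(epd_nul, 1.0)
        if r + epd <= 1:
            return r, epd


def sample_borderline_alt(r_alt, epd_alt, rng):
    """Fix one parameter at its borderline value; draw the other from its
    noninteresting range. Assumes r_alt + epd_alt <= 1."""
    if rng.random() < 0.5:                      # assumption: equal chance
        r = r_alt
        epd = rng.uniform(epd_alt, 1.0 - r_alt)  # epd no better than epd_alt
    else:
        epd = epd_alt
        r = rng.uniform(0.0, min(r_alt, 1.0 - epd_alt))  # r no better than r_alt
    return r, epd
```

For example, `sample_borderline_alt(0.2, 0.4, random.Random(0))` returns either r = 0.2 with epd between 0.4 and 0.8, or epd = 0.4 with r between 0 and 0.2.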


    RESULTS
 
Thresholds Created by the Full-Space Method
Table 1 indicates the patient thresholds required for acceptance or rejection of the null hypothesis after stages 1 and 2, when drugs of interest were assumed to have r greater than or equal to ralt or to have epd less than or equal to epdalt. Power was specified at .80; α error was specified at .05; 15 patients were assessed in each stage.


Table 1. Thresholds Created by the Full-Space Method

 
Results would be read as follows: For the parameters of ralt ≥ 0.4, rnul ≤ 0.2, epdalt ≤ 0.2, and epdnul ≥ 0.4, the drug would be rejected and the study would be stopped after stage 1 of accrual if the number of responders was less than or equal to four of 15 and the number of patients with EPD was greater than or equal to four of 15 or if the number of responders was less than or equal to five of 15 and the number of patients with EPD was greater than or equal to five of 15. If neither of these two pairs of criteria was met, the study would accrue the second stage; early stopping was not permitted for drugs that met the criteria for the alternate hypothesis, although these pairs were listed. At the end of stage 2, the drug would be accepted as active if the number of responders was greater than or equal to 12 of 30 or if the number of patients with EPD was less than or equal to six of 30. In practice, it would otherwise be rejected. Strictly, however, the program was designed to reject drugs if they fell within the ranges of response and progressive disease that were considered uninteresting. The listed criteria for drug rejection were, therefore, the basis for the α error calculation. In Table 1, a few pairs may be found between the acceptance and rejection criteria at stage 2; therefore, the α error in practice would be slightly lower (ie, better) than that stated by the program.

In all assessments of these thresholds by the full-space method, the error rates achieved were better than required. Table 1 also indicates the expected average study size when a drug met either the criteria of interest or disinterest. As expected with the low α value, studies in which uninteresting drugs were assessed had greater than 96% likelihood of stopping at the first stage; this provided a corresponding average accrual number just greater than 15, which was the number of patients recruited at stage 1.
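The link between the probability of early stopping and the average accrual can be made explicit with a short calculation. The sketch below assumes the usual two-stage identity, expected size = n1 + n2 × (1 − probability of stopping at stage 1); the function name is hypothetical.

```python
def expected_sample_size(n1: int, n2: int, p_stop_stage1: float) -> float:
    """Average accrual for a two-stage design with early stopping."""
    if not 0.0 <= p_stop_stage1 <= 1.0:
        raise ValueError("p_stop_stage1 must be a probability")
    # Stage 1 is always accrued; stage 2 only when the trial continues.
    return n1 + n2 * (1.0 - p_stop_stage1)


# With 15 patients per stage and a 96% chance of stopping at stage 1,
# the average accrual is about 15.6 patients.
average = expected_sample_size(15, 15, 0.96)
```

Because the reported likelihood of stopping exceeded 96%, the average accrual for uninteresting drugs in Table 1 would fall between 15 and about 15.6 patients.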

Thresholds Created by the Borderline-Value Method
Despite the apparent high power in Table 1, drugs with r or epd of marginal but passable levels of interest may not be detected. The borderline-value method generated drugs with either r equal to ralt and epd greater than epdalt or with epd equal to epdalt and r less than ralt. Table 1 lists thresholds generated by the full-space method that have poor power to detect such borderline drugs.

However, the power and α error met the specified values with the use of thresholds designed for such borderline-value drugs, as indicated in Table 2. Compared with the thresholds generated in Table 1, the borderline-value method generated threshold pairs for the alternate hypothesis that were more generous; n1r + n2r is lower and n1p + n2p is higher. Thus, if the borderline-value thresholds are applied to assess a population of drugs generated by the full-space method, the power increases to greater than .97 in all cases, and the α errors remain essentially unchanged (data not shown).


Table 2. Thresholds Created by the Borderline-Value Method

 
The thresholds in Table 2 assume that uninteresting drugs have both r less than or equal to rnul and epd greater than or equal to epdnul, and the values were taken from the full space of possibility. However, drugs of borderline disinterest can be assessed. Table 3 lists thresholds that will detect interesting drugs of borderline value (as Table 2) but will also detect uninteresting drugs with both r equal to rnul and epd equal to epdnul. This more challenging situation results in thresholds of interest and disinterest that are immediately adjacent. It fails to achieve the desired α error in all instances. Only slight improvements were achieved by using a trial size of n = 45 (data not shown).


Table 3. Thresholds With Both Interesting and Uninteresting Drugs That Have Borderline Values

 
A sensitivity analysis on the basis of the borderline method of Table 2 is listed in Table 4. A decrease in the difference between epdnul and epdalt or an alteration of the stage size resulted in small changes in error rates.


Table 4. Sensitivity Testing on the Basis of the Borderline-Value Method

 

    DISCUSSION
 
The EPD rate provides another means of drug activity assessment and offers the potential to stop studies early when the rate is undesirably high. This article creates a set of rules to incorporate RR and EPD rate.

Two different methods to determine patient thresholds for response and EPD are described. The full-space method simulates drugs whose r and epd are chosen from throughout the ranges relevant to the null or alternate hypothesis. It is attractive in its absence of bias or presupposition about the characteristics of the drugs under study. As indicated in Table 1, thresholds were generated that met the specified error rates. The power is greater than required so that the thresholds detect drugs with either a favorable RR or a favorable EPD rate. A lesser power could have been chosen but would have been without gain, as the α error was sufficiently low.

Thresholds generated by the full-space method may dismiss drugs of marginal activity that still meet the alternate hypothesis, as listed in Table 1. In contrast, the borderline-value method creates thresholds that assume marginal activity for a drug and achieves the specified power and α error rates (Table 2). A sensitivity analysis (Table 4) on the basis of the borderline method shows small changes in error rates. Thresholds did not change in every case, partly because the model was designed to preserve power. A larger sample size did not necessarily lead to better power; the parameter set (n1 = 20, n2 = 20; null hypothesis, r ≤ 0.05 and epd ≥ 0.6, versus alternate hypothesis, r ≥ 0.2 or epd ≤ 0.4) did not result in a need for the program to reconcile overlapping thresholds for the null and alternate hypotheses, an instance that contributes to improved α error at the expense of β error.

In addition, Table 3 lists thresholds created by drugs that have borderline characteristics for both cohorts of interest and of disinterest. The rules fail to meet the specified α error rate, which could only be improved at the expense of power.

Although extremely good drugs are unlikely to exist, poor drugs are more frequently encountered; RRs near zero are observed in early trials, and the EPD rate can be high in advanced or pretreated disease. Therefore, the value of rules designed to sensitively detect drugs of only borderline disinterest is doubtful. The utility of the parameters of Table 3 is in indicating the limits of such rules.

Other cohorts of drugs could be used to generate thresholds that have distributions that favor particular portions of the space of interest or disinterest. The difficulty is in knowing what RR or EPD rate distribution would be likely. For practical reasons, we favor the thresholds capable of detecting drugs of borderline interest as generated for Table 2.

The initial paper that added EPD to the assessment of phase II trials was found later by its authors to have poorer power than intended.19,20 That paper differs from the present paper in the employment of multiple pairs of threshold criteria that reject the null hypothesis. Because the alternate hypothesis is designed to detect drugs with either interesting parameter, a single pair suffices. However, that initial paper indicated the importance of the detection of drugs with minimal but sufficient activity, and it supports the use of rules of Table 2.

Unlike this article, Chang et al21 designed a trial to allow acceptance of a drug after the first stage of accrual, which may be appropriate for extremely active drugs or limited resources. In a heterogeneous population, many investigators prefer to accrue additional patients. This will modestly improve the confidence intervals around outcome rates and will allow better planning of phase III studies for drugs with marginal activity. Correspondingly, desired error rates may be difficult to achieve after only the first stage of accrual; after stage 1, in Table 2, α error rates were .06, .11, .11, and .12, respectively.

The consequence of rejection of the null hypothesis at stage 1 can be shown. Chang et al21 indicate that their hypotheses can be reversed and give one example with the hypotheses set as in the present paper. By using specified parameters (power = .8; α = .05; n1 = 21, n2 = 21; null hypothesis, r ≤ 0.4 and epd ≥ 0.3, versus alternate hypothesis, r ≥ 0.6 or epd ≤ 0.1), their method rejects the null hypothesis at stage 1 if n1r is greater than or equal to 15 or if n1p equals 0, accepts the null hypothesis if n1r is less than or equal to seven and if n1p is greater than or equal to five, and it otherwise accrues stage 2. At the second stage, if n1r + n2r ≥ 23 or n1p + n2p ≤ 6, the null hypothesis is rejected; otherwise, it is accepted and achieves α = .0494 and power = .804. With the same parameters, our borderline model accepts the null hypothesis at stage 1 if n1r is less than or equal to 10 and if n1p is greater than or equal to five and otherwise continues to stage 2. At the second stage, if n1r + n2r ≥ 21 or if n1p + n2p ≤ 8, the null hypothesis is rejected, and α = .0145 and power = .890. The thresholds given by Chang et al,21 thus, are more likely to reject the null hypothesis at stage 1 but are less likely to do so at stage 2. This results, most likely, from their method that allows early rejection of the null hypothesis at stage 1 and thus needs to provide less liberal rejection criteria at stage 2 to achieve the desired α error.

By applying the rules of Zee et al19 to 39 completed, phase II trials that were assessed by RR only, Dent et al22 found that rules that incorporated EPD were more likely to stop apparently ineffective drugs at the first stage of accrual. Conversely, after stage 2, the method of Zee et al19 suggested activity in two instances, whereas the method of Fleming16 did not. The disputed drugs were not additionally studied, so activity could not be confirmed. The study of Dent et al22 demonstrates the potential for more frequent early stopping of inactive drugs as well as the unconfirmed possibility that rules that incorporate EPD may be more sensitive to drug activity.

Another potential benefit to the use of EPD, as suggested by Zee et al,19 is that its presence can be determined at first tumor measurement, even if responses cannot be fully assessed; this information may allow rapid commencement of stage 2 of accrual after stage 1 is completed, which would allow avoidance of the delay induced by waiting for potential responses.

Although the described method capably arrests the study of inactive drugs after stage 1, the "or" alternate hypothesis may best be applied to situations in which only modest drug activity is expected. Cytostatic agents or populations with poor rates of expected response (eg, pancreatic cancer) still may accrue benefit from nonprogressive disease, and low EPD rates may indicate drugs worthy of further consideration. Alternatively, drugs active against specific biologic subsets of disease (eg, trastuzumab) may demonstrate some responses but otherwise demonstrate little stable disease if assessed in unselected populations. In the above situations, the demands of an "and" alternate hypothesis may result in drug discard. Conversely, the ethics of window-of-opportunity studies demand that only the most active drugs continue to stage 2, which supports the use of an "and" hypothesis.

The assessment of EPD in addition to response could shorten phase II studies through early stopping when the EPD rate is undesirably high. Although EPD may also increase the sensitivity of phase II studies to drug activity, this must be confirmed and will depend on the criteria of the alternate hypothesis. This article creates rules for an alternate hypothesis that allows a drug to be considered interesting if the criteria are met for either a sufficiently high RR or a sufficiently low rate of EPD. It is hoped that the use of such rules may improve drug development.


    AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
 
The author(s) indicated no potential conflicts of interest.


    AUTHOR CONTRIBUTIONS
 
Conception and design: John R. Goffin, Dongsheng Tu

Collection and assembly of data: John R. Goffin

Data analysis and interpretation: John R. Goffin, Dongsheng Tu

Manuscript writing: John R. Goffin, Dongsheng Tu

Final approval of manuscript: John R. Goffin, Dongsheng Tu


    Appendix
 
Detailed Methods Section of Phase II Stopping Rules That Employ Response Rates and Early Progression. The simulation lets r be the true response rate of the agent under study and epd its true rate of early progressive disease. Assume that one can specify two pairs of parameters, (rnul, epdnul) and (ralt, epdalt), which would render a drug uninteresting for further development if r is less than or equal to rnul and epd is greater than or equal to epdnul and interesting for further development if r is greater than or equal to ralt or epd is less than or equal to epdalt. That is, in a phase II trial, we are interested in the testing of the following hypotheses: the null hypothesis of r less than or equal to rnul and epd greater than or equal to epdnul versus the alternate hypothesis of r greater than or equal to ralt or epd less than or equal to epdalt.

This article considers the following two-stage procedure for the testing of the above hypotheses: In the first stage, n1 patients are entered. If n1r and n1p are, respectively, the number of patients who responded and the number who had early progression among these n1 patients, the trial would be stopped at this stage if n1r was less than or equal to n1r-nul and if n1p was greater than or equal to n1p-nul, in which n1r-nul and n1p-nul are two thresholds determined during the design of the study. Otherwise, n2 additional patients are entered in the second stage of the study. Let n2r and n2p be, respectively, the number of patients who responded and the number who had early progression among these n2 patients. The drug will be declared as interesting at the end of the second stage if n1r + n2r ≥ n1r-alt + n2r-alt or n1p + n2p ≤ n1p-alt + n2p-alt, in which n2r-alt and n2p-alt are another two thresholds determined during the design of the study.

Simulations were performed by using TreeAge Pro Healthcare software (Williamstown, MA) to determine thresholds (program available on request). For each simulation, the following parameters were prespecified: design parameters (rnul, epdnul) and (ralt, epdalt), desired power and α error of the study, and the number of patients in the first stage and second stage. Patient cohorts for error assessment of thresholds were generated as integer pairs of actual r and epd by two methods: the full-space method and the borderline-value method.

In the full-space method, simulations of the alternative hypothesis randomly select a value for either r or epd from within the range of interest. The choice of precedence of r or epd was assigned randomly on the basis of a frequency assessment of the overall potential space in which a drug would have a value of r greater than or equal to ralt and/or epd less than or equal to epdalt. Thus, either r greater than or equal to ralt or epd less than or equal to epdalt is generated. The remaining (nonprecedent) parameter then was selected randomly from the values that satisfy r + epd ≤ 1. In Figure A1, for example, a pair of r and epd would be selected with equal probability from anywhere above and/or left of the line of border for values of interest (ie, a line defined by ralt = 20% and epdalt = 40%).


Fig A1. Patient thresholds to achieve specified error rates (Note: n = 30, 1-β = 0.8, α = 0.05).

 
For simulations of the null hypothesis, the same method was used, but both r and epd must satisfy r less than or equal to rnul and epd greater than or equal to epdnul.

The borderline-value method assumed that extremely desirable values for r and epd were unlikely and that inclusion of such values in assessed populations would under-power a study to detect drugs with borderline r or epd characteristics. In the borderline-value method, cohorts for the assessment of the alternate hypothesis were generated by random assignment of r = ralt or epd = epdalt. The nonprecedent parameter then was randomly assigned a value within its noninteresting range (ie, r < ralt or epd > epdalt), such that r + epd ≤ 1. Cohorts for the null hypothesis assessment were generated as described for the full-space method.

Successive patient thresholds for nr-alt and np-alt and for nr-nul and np-nul were run through 10^6 simulations of randomly generated r and epd values (by using either the full-space or borderline-value method) to determine the power or α error that each would generate. Fig A1 demonstrates that the desired power can be achieved with different threshold pairs. With the parameters of α = .05, power = .8, n = 30 in a single stage, ralt = 0.20, rnul = 0.05, epdalt = 0.4, and epdnul = 0.6, several pairs of nr-alt and np-alt might be taken that achieve the required power. To miss the fewest number of drugs that fall near the borderline acceptable values, the method selects the most generous combination of the pairs, in this instance combining the nr-alt = 8 from the program-generated option 2 with the np-alt = 10 from option 3. For power in the borderline-value method, thresholds were taken when possible from pairs that did not encroach on the area of disinterest.
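For a single stage and one fixed pair of true rates, the probability that a threshold pair rejects the null hypothesis can also be computed exactly rather than by simulation. The sketch below is an alternative to the simulation approach described above, not part of the authors' program. It assumes each patient independently has a response, early progression, or neither (a trinomial model), and the function name is hypothetical.

```python
from math import comb


def single_stage_reject_prob(n, r, epd, nr_alt, np_alt):
    """Exact P(responders >= nr_alt or progressors <= np_alt) among n patients."""
    other = max(0.0, 1.0 - r - epd)   # probability of neither outcome
    total = 0.0
    for x in range(n + 1):            # x = number of responders
        for y in range(n - x + 1):    # y = number with early progression
            if x >= nr_alt or y <= np_alt:
                z = n - x - y
                # Trinomial probability of observing exactly (x, y, z).
                total += (comb(n, x) * comb(n - x, y)
                          * r ** x * epd ** y * other ** z)
    return total


# Power at one borderline point of the example above: r = ralt = 0.20 with
# an uninteresting epd of 0.60, using nr-alt = 8 and np-alt = 10 with n = 30.
power_at_borderline = single_stage_reject_prob(30, 0.20, 0.60, 8, 10)
```

Evaluating the same function at points in the null region (for example, r = 0.05 and epd = 0.6) gives the corresponding single-stage α error for that point, which allows candidate threshold pairs to be compared without sampling noise.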

More than one pair could be taken for the α error thresholds, as such pairs could encompass different areas of rnul and epdnul. The program preferentially selected thresholds in which neither nr-nul nor np-nul overlapped with thresholds for the alternate hypothesis; in instances of overlap, thresholds were reconciled such that the thresholds for power were given precedence.

After the thresholds were determined for stages 1 and 2, cohorts were run through a two-stage trial by using 10^6 simulations.


    NOTES
 
Supported by a grant from the Amgen Career Development Award (J.R.G.).

Presented in part at the 43rd Annual Meeting of the American Society of Clinical Oncology, June 1-5, 2007, Chicago, IL.

Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.


    REFERENCES
 
1. DiMasi JA, Hansen RW, Grabowski HG: The price of innovation: New estimates of drug development costs. J Health Econ 22:151-185, 2003

2. Booth B, Glassman R, Ma P: Oncology's trials. Nat Rev Drug Discov 2:609-610, 2003

3. Goffin J, Baral S, Tu D, et al: Objective responses in patients with malignant melanoma or renal cell cancer in early clinical studies do not predict regulatory approval. Clin Cancer Res 11:5928-5934, 2005

4. DiMasi JA, Grabowski HG: Economics of new oncology drug development. J Clin Oncol 25:209-216, 2007

5. Buyse M, Thirion P, Carlson RW, et al: Relation between tumour response to first-line chemotherapy and survival in advanced colorectal cancer: A meta-analysis. Meta-Analysis Group in Cancer. Lancet 356:373-378, 2000

6. Graf W, Pahlman L, Bergstrom R, et al: The relationship between an objective response to chemotherapy and survival in advanced colorectal cancer. Br J Cancer 70:559-563, 1994

7. Markman M: Why does a higher response rate to chemotherapy correlate poorly with improved survival? J Cancer Res Clin Oncol 119:700-701, 1993

8. Paesmans M, Sculier JP, Libert P, et al: Response to chemotherapy has predictive value for further survival of patients with advanced non–small-cell lung cancer: 10 years experience of the European Lung Cancer Working Party. Eur J Cancer 33:2326-2332, 1997

9. Burris HA III, Moore MJ, Andersen J, et al: Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: A randomized trial. J Clin Oncol 15:2403-2413, 1997

10. Shepherd FA, Dancey J, Ramlau R, et al: Prospective randomized trial of docetaxel versus best supportive care in patients with non–small-cell lung cancer previously treated with platinum-based chemotherapy. J Clin Oncol 18:2095-2103, 2000

11. Cesano A, Lane SR, Poulin R, et al: Stabilization of disease as a useful predictor of survival following second-line chemotherapy in small-cell lung cancer and ovarian cancer patients. Int J Oncol 15:1233-1238, 1999

12. Howell A, Mackintosh J, Jones M, et al: The definition of the 'no change' category in patients treated with endocrine therapy and chemotherapy for advanced carcinoma of the breast. Eur J Cancer Clin Oncol 24:1567-1572, 1988

13. Murray N, Coppin C, Coldman A, et al: Drug delivery analysis of the Canadian multicenter trial in non–small-cell lung cancer. J Clin Oncol 12:2333-2339, 1994

14. Rapp E, Pater JL, Willan A, et al: Chemotherapy can prolong survival in patients with advanced non–small-cell lung cancer: Report of a Canadian multicenter randomized trial. J Clin Oncol 6:633-641, 1988

15. Roberts TG Jr, Lynch TJ Jr, Chabner BA: The phase III trial in the era of targeted therapy: Unraveling the "go or no go" decision. J Clin Oncol 21:3683-3695, 2003

16. Fleming TR: One-sample multiple testing procedure for phase II clinical trials. Biometrics 38:143-151, 1982[CrossRef][Medline]

17. Gehan EA: The determination of the number of patients required in a preliminary and a follow-up trial of a new chemotherapeutic agent. J Chronic Dis 13:346-353, 1961[CrossRef][Medline]

18. Simon R: Optimal two-stage designs for phase II clinical trials. Control Clin Trials 10:1-10, 1989[Medline]

19. Zee B, Melnychuk D, Dancey J, et al: Multinomial phase II cancer trials incorporating response and early progression. J Biopharm Stat 9:351-363, 1999[CrossRef][Medline]

20. Freidlin B, Dancey J, Korn EL, et al: Multinomial phase II trial designs. J Clin Oncol 20:599, 2002[Free Full Text]

21. Chang MN, Devidas M, Anderson J: One- and two-stage designs for phase II window studies. Stat Med 26:2604-2614, 2007[CrossRef][Medline]

22. Dent S, Zee B, Dancey J, et al: Application of a new multinomial phase II stopping rule using response and early progression. J Clin Oncol 19:785-791, 2001[Abstract/Free Full Text]

Submitted August 21, 2007; accepted January 3, 2008.



Related Correspondence

  • Problems Identified With Phase II Stopping Rules That Employ Response and Early-Progression Rates
    James R. Anderson and Mark D. Krailo
    JCO 2009 27: 646-647 [Full Text]




Copyright © 2008 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X