Journal of Clinical Oncology, Vol 26, No 13 (May 1), 2008: pp. 2236-2237 © 2008 American Society of Clinical Oncology. DOI: 10.1200/JCO.2007.15.6885
In Reply
Department of Gastrointestinal Medical Oncology, The University of Texas M.D. Anderson Cancer Center, Houston, TX
I thank Dr Gonen1 for responding to my letter.2 I have certainly benefited from his instructive response, and I hope that this is just the beginning of a continued educational process to benefit the many of us who rely frequently on survival statistics to make treatment decisions, choose research directions, and teach. Statistics and survival curves are sometimes discussed with patients and their families to justify the recommendations we make. All clinical oncologists need to understand nearly everything listed on the figure and the meaning of the probability (P) value. In explaining mean and median survival times, Gonen has added a layer of mystery to the survival statistics for us. He points out that the area between the curves (calculated by the Kaplan-Meier method) may be the same in two different studies (each comparing two survival curves with each other) but can result in a different P value calculated by the log-rank test. This is an important statement because the area between the two Kaplan-Meier curves in different regions of the curves would have a different impact on the hazard function and would depend on the number of patients at risk. For example, a smaller area between the first 25% of the curves might carry larger differences in the hazard than the same area between the two curves toward the last 25% of the curves. Thus, we should not look at the area between the curves alone but should also understand how many patients are at risk at a given time point; this exercise, however, is not fully satisfactory unless the number of patients at risk is listed on each figure. We know that for GI cancer studies there would almost always be fewer patients at risk in the last 25% of the curves than in the first 25% of the curves.
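As a rough sketch of why equal areas need not give equal P values, the two quantities can be written side by side. The notation below (S_A, S_B, tau, n_j, d_j, and so on) is assumed for illustration and does not come from Gonen's letter, and the log-rank statistic is shown in one common textbook form rather than as the formula used in any particular study.

```latex
% Area between two survival curves up to a chosen horizon \tau (assumed notation)
\[
  \Delta(\tau) = \int_0^{\tau} \bigl[ S_A(t) - S_B(t) \bigr] \, dt
\]
% Log-rank statistic over the distinct event times t_1 < \dots < t_J
%   n_j    : patients at risk just before t_j (both arms); n_{Aj}: those in arm A
%   d_j    : events at t_j (both arms);                    d_{Aj}: those in arm A
\[
  E_{Aj} = d_j \, \frac{n_{Aj}}{n_j}, \qquad
  V_j = \frac{d_j \, n_{Aj} \, (n_j - n_{Aj}) \, (n_j - d_j)}{n_j^{2} \, (n_j - 1)}
\]
\[
  \chi^2 = \frac{\Bigl[ \sum_{j=1}^{J} \bigl( d_{Aj} - E_{Aj} \bigr) \Bigr]^{2}}{\sum_{j=1}^{J} V_j}
\]
```

The area depends only on the vertical gap between the curves, whereas every term of the log-rank sum depends on how many patients are still at risk and how many events occur at that time. Two pairs of curves can therefore enclose the same area and still produce different values of the statistic, which is referred to a chi-square distribution with 1 degree of freedom to obtain the P value.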
Another important point by Gonen is that the survival curves and the P value are calculated by two different methods. We need to know more about this. Although he acknowledges at the end of his letter that the late separation of the curves might be from a small subset of patients who benefit from therapy, he expresses less confidence in this observation. If the late separation of the curves is not caused by the small fraction of patients who derive benefit from our current suboptimal therapies, how can we view or understand this event? We know that no matter what type of therapy we choose (surgery, chemoradiotherapy, biochemotherapy, or any of the combinations), often only a few patients seem to benefit (and sometimes none do) from this empiric selection of therapy. I feel that we nonstatisticians need considerably more understanding of statistical methods as well as their limitations and advantages. A better comprehension would reduce our conventional emphasis on the median time when we stipulate study statistics before the protocol starts accruing patients and when we analyze the results after the study is completed. The emphasis on the median, although it may have been justifiable in the past, relies on the age-old philosophy that patient outcomes are more likely to be homogeneous than heterogeneous. In contrast, we now know that patient outcomes are more heterogeneous than they are homogeneous. This old notion is thoroughly incorporated in the assumptions made by the survival statistics even today.3 This approach and understanding may need to change. We need new statistical methods to accommodate the new knowledge that there is considerable heterogeneity in outcome (and patients). The basic premise of the null hypothesis is against this new understanding because the null hypothesis (which must be proved wrong to make advances in oncology and other disciplines) assumes that almost everything is the same.
To better understand the Kaplan-Meier method and the log-rank test, I explored the Internet. There are several podcasts and videocasts available (keithbower.com is interesting, as are the podcasts of live classes from the University of California at Berkeley, Massachusetts Institute of Technology, Yale University, and a few others; search iTunes to find these). Some aspects of these tutorials are simplified enough that nonstatisticians can benefit, but I did not find any tutorials on biostatistics. I think there is an opportunity for cancer centers and universities to launch biostatistical tutorials on the Internet. I found the most satisfactory series to be the one periodically published in the British Medical Journal by Bland and Altman. These authors have dealt with specific topics, including the Kaplan-Meier method3 and the log-rank test.4 They take particular care to simplify their message by using actual examples. It appears that they know that their audience is predominantly nonstatisticians. The Kaplan-Meier method is explained as a step-wise analysis of patients within a specified timeframe. For example, we can look at the proportion surviving at the end of the first month or between the beginning and end of the second month (with the assumption that the patients at risk during this timeframe survived the entire timeframe) and so on. More than one population can be represented on the same figure; however, the censored patients at a given time create a number of uncertainties and certain precautions are listed.3 Also, as Gonen points out, the Kaplan-Meier method uses the mean survival time for the patient and does not provide comparison of total survival. In contrast, the log-rank test tests the null hypothesis by considering the time to event (taking the total survival experience into account) to calculate the P value. I wonder if it is time for a new way to express the survival comparisons.
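The step-wise calculation described above can be illustrated with a minimal sketch of the product-limit (Kaplan-Meier) estimate. This is an illustration under stated assumptions, not the procedure of any cited study: the function name `kaplan_meier` and the six-patient data set are hypothetical, and the sketch steps at each observed death time rather than at fixed monthly intervals. A patient censored at a given time is counted as still at risk at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate (illustrative sketch).

    times  : follow-up time for each patient (hypothetical units, e.g. months)
    events : 1 if the patient died at that time, 0 if the patient was censored
    Returns a list of (time, n_at_risk, n_deaths, survival) tuples, one per
    distinct death time.
    """
    # Count deaths at each distinct time; censored patients add no step.
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    rows = []
    for t in sorted(deaths):
        # Patients still under observation just before time t
        # (includes anyone who dies or is censored exactly at t).
        at_risk = sum(1 for u in times if u >= t)
        d = deaths[t]
        # Multiply by the conditional probability of surviving past t.
        survival *= (at_risk - d) / at_risk
        rows.append((t, at_risk, d, survival))
    return rows


if __name__ == "__main__":
    # Hypothetical data: six patients, two of them censored (event = 0).
    times = [2, 3, 3, 5, 8, 9]
    events = [1, 1, 0, 1, 0, 1]
    for t, n, d, s in kaplan_meier(times, events):
        print(f"time={t}: at risk={n}, deaths={d}, S(t)={s:.3f}")
```

Reporting the number at risk alongside each survival estimate, as the second column does here, shows how few patients support the later steps of a curve.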
If a new statistical method is not forthcoming, should journals not require depiction of the number of patients at risk at specified times and the inclusion of survival estimates? To paraphrase from one of the podcasts, "if the P value is low, the null hypothesis must go" and "assume that everything is the same unless proved different." Are we closer to the reality here or veering away from it? Improved understanding of biostatistical methods and the results they produce is critical in research and clinical practice, and I hope that we will soon have a comprehensive systematic experience enlightening us all.

POST SCRIPT: Since submitting my response to Dr Gonen's letter, two important and pertinent articles have been published. Royston et al5 make an attempt to deal with the uncertainties created by censoring of patients. They consider patient heterogeneity (by using prognostic markers). Very interesting! The accompanying editorial deals with the weaknesses of the Kaplan-Meier method.6 We are at the beginning of something new.

AUTHOR'S DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST
The author(s) indicated no potential conflicts of interest.

REFERENCES
2. Ajani JA: The area between the curves get no respect: Is it because of the median madness? J Clin Oncol 25:5531, 2007
3. Bland JM, Altman DG: Survival probabilities (the Kaplan-Meier method). BMJ 317:1572, 1998
4. Bland JM, Altman DG: The logrank test. BMJ 328:1073, 2004
5. Royston P, Parmar MKB, Altman DG: Visualizing length of survival in time-to-event studies: A complement to Kaplan-Meier plots. J Natl Cancer Inst 100:92-97, 2008
6. Wittes J: Times to event: Why are they hard to visualize? J Natl Cancer Inst 100:80-81, 2008
Copyright © 2008 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X