© 2002 American Society for Clinical Oncology
The Mayo Lung Cohort: A Regression Analysis Focusing on Lung Cancer Incidence and Mortality
From the Division of Hematology-Oncology, Roger Williams Medical Center, Providence, RI; and Department of Medicine, Boston University School of Medicine, Boston, MA.
Address reprint requests to Gary M. Strauss, MD, MPH, Division of Hematology-Oncology, Roger Williams Medical Center, 825 Chalkstone Ave, Providence, RI 02908; email: gstrauss{at}ids.net
PURPOSE: The Mayo Lung Project has been interpreted as negative because it failed to demonstrate a significant mortality reduction among those randomized to chest x-ray and cytology. In contrast, survival suggests that screening is highly effective. This report was undertaken to analyze the trial as a closed cohort study, in an effort to identify predictors of lung cancer incidence and mortality, and to determine whether survival or mortality was unbiased.
PATIENTS AND METHODS: The Mayo Lung Cohort comprised all 9,192 randomized individuals. Cox proportional hazards regression was used both to determine predictors of incidence and mortality in the population and to identify predictors of mortality among cases. Survival analyses using intent-to-treat principles and measuring survival from randomization were used to evaluate length bias and lead-time bias. Multivariate Cox regression was used to investigate the extent to which the data are consistent with overdiagnosis.
RESULTS: Cox regression demonstrates that, in addition to age and smoking, randomization to screening predicted increased lung cancer incidence (hazard ratio, 1.30; 95% confidence interval [CI], 1.06 to 1.60). Predictors of mortality were similar, except randomization to screening was not significant (hazard ratio, 1.06; 95% CI, 0.83 to 1.37). Among cases, survival was significantly superior in the experimental population. Higher incidence in the experimental group accounts for the mortality/survival discrepancy. Both lead-time and length biases can be excluded, because survival from randomization was superior in the experimental population. Overdiagnosis is eliminated because resection was the only significant multivariate predictor of survival. Overall, 50% of resected and 0% of unresected cases were cured.
CONCLUSION: Survival was superior in the screened population, and this advantage was not attributable to lead-time bias, length bias, or overdiagnosis bias. Mortality was biased, because incidence differences confounded the ability of mortality to reflect the true effect of screening. Indeed, survival provided an unbiased surrogate for cure in the Mayo Lung Cohort.
Screening for lung cancer is not currently recommended.1-3 This is because randomized trials have consistently failed to demonstrate a significant reduction in lung cancer mortality in populations randomized to screening. In the 1970s, the National Cancer Institute (NCI) sponsored three randomized trials on screening for lung cancer. The Memorial Sloan-Kettering and Johns Hopkins Lung Projects were explicitly designed to evaluate sputum cytology, insofar as they compared annual chest x-ray (CXR) alone to annual CXR and triannual cytology.4,5 Although both failed to demonstrate a mortality reduction by the addition of cytology to CXR, stage distribution, resectability, and 5-year survival were approximately three-fold better than contemporary estimates based on the NCI's Surveillance, Epidemiology, and End Results database.6 Accordingly, these results suggest that CXR screening was superior to no screening. The Mayo Lung Project was the most influential of the NCI randomized trials, because it was designed to evaluate both CXR and cytology. Although its major results are well known, proper interpretation remains highly controversial.
Mayo Lung Project: The Controversy
The Mayo Lung Project consisted of both prevalence screening and incidence screening. A total of 10,933 men underwent prevalence screening, consisting of CXR and cytology.7 Prevalence cancers were detected in 91, of which 65% were detected by CXR, 19% by cytology, and 16% by both CXR and cytology. Fifty-four percent had occult or stage I disease, and resection was also accomplished in 54%. Five-year survival was 40%.7,8 Among those free of detectable lung cancer on prevalence screening, 9,212 were randomized to an experimental group (EG), which underwent CXR and cytology every 4 months, or to a control group (CG).8 CG participants were advised to undergo annual CXR and cytology, although they underwent no screening as part of the study. Approximately half of CG participants underwent annual CXR during the trial, and 73% underwent CXR during the final 2 years.9 On average, screening in the EG continued for 6 years and was followed by 3 years of observation, whereas the CG was followed, on average, for 9 years.8 After 9 years, lung cancer mortality was slightly higher in the EG (relative risk, 1.06; 95% confidence interval [CI], 0.83 to 1.37). Conversely, 5-year and 9-year survival significantly favored the EG. Such apparently paradoxic findings were possible because lung cancer incidence was significantly higher in the EG (relative risk, 1.30; 95% CI, 1.06 to 1.60). Because cause-specific mortality is assumed to provide an unbiased measure of screening efficacy,10 and because survival (and other clinical end points) may be confounded by conventional screening biases, the Mayo Lung Project has been widely interpreted as supporting the conclusion that screening was ineffective.
Indeed, the constellation of improved survival, higher EG incidence, and similar mortality led to the suggestion that screening was associated with the overdiagnosis of lung cancer.11 In this regard, Morrison10 coined the term "pseudodisease" to describe "a lesion that becomes known only as a result of screening; it would not be discovered otherwise." Overdiagnosis occurs when screening detects pseudodisease.12 Marcus et al13 recently extended follow-up of Mayo participants to a median of 20.5 years, and found no reduction in lung cancer mortality with prolonged follow-up. They concluded, "similar mortality but better survival for individuals in the intervention arm indicates that some lesions with limited clinical relevance may have been identified in the intervention arm." Conversely, overdiagnosis was not an a priori hypothesis the Mayo Lung Project was designed to confirm. Moreover, it contradicts abundant other evidence.14 The possibility of randomization failure has been suggested as an alternative to overdiagnosis.15 This report was undertaken to analyze the study from the perspective of a closed cohort study using multivariate regression techniques. In this regard, the Mayo Lung Cohort (MLC) is defined as individuals randomized to incidence screening in the Mayo Lung Project. This report has four objectives with regard to the MLC: (1) to identify significant predictors of lung cancer incidence and mortality; (2) to demonstrate that multivariate techniques facilitate an analysis that permits definitive conclusions regarding the extent to which lead-time bias, length bias, and/or overdiagnosis bias confound survival comparisons; (3) to examine evidence that randomization failure may have confounded mortality comparisons; and (4) to draw appropriate inferences regarding the efficacy of CXR and sputum cytology.
Statistical Methods
Two categories of covariates are defined. Population parameters are defined for every MLC participant. Population variables include group assignment, age at entry, number of cigarettes smoked, and decades smoked. Additional population parameters include history of bronchitis/pneumonia; history of emphysema; and history of exposure to air pollution, arsenic, asbestos, nickel or chromium, radioactive material, and tuberculosis. The second category of covariates comprises clinical parameters, which are defined only for those diagnosed with lung cancer. Clinical parameters include method of detection, tumor histology, stage of disease, and resectability. Method of detection was reported in three categories (CXR, sputum cytology, or symptoms), although CXR- or cytology-detected lesions were collapsed into a single screen-detected category for many analyses. Only six (1.6%) cases in the MLC were detected by both CXR and cytology; these were categorized as CXR-detected. Although histologic subtype (squamous cell, adenocarcinoma, large-cell, small-cell, and other) was available, histology was collapsed into nonsmall-cell lung cancer (NSCLC) or small-cell categories for regression analysis. Although a stage variable was available, it was extremely limited. Patients were classified as early stage if they had postsurgical stage I or II disease. All others were classified as advanced stage. Because all early-stage patients were resected, stage was collinear with resectability (Spearman correlation coefficient = 0.88). Accordingly, stage was excluded from multivariate analyses in which resection was modeled. All statistical analyses were carried out using the STATA 6 statistical software package (STATA Corp, College Station, TX). Cox proportional hazards regression was used to model both lung cancer incidence and mortality. In the Cox model, the primary outcome measure is the hazard ratio (HR).
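For readers less familiar with the method, the Cox model and the hazard ratio it yields can be written as follows. This is the standard textbook form, offered as a sketch; the symbols $h_0$, $x_j$, and $\beta_j$ are notation assumed here for illustration and are not taken from the original analysis.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \cdots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Here $h_0(t)$ is an unspecified baseline hazard, and $\mathrm{HR}_j$ is the multiplicative change in hazard per unit increase in covariate $x_j$ with the other covariates held fixed. An HR above 1 indicates increased risk; an HR below 1 indicates reduced risk.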
Depending on the context, mortality can be considered either a population parameter or a clinical parameter. It is coded as present or absent for every MLC participant; when analyzed in this form, mortality is a population parameter. When mortality among cases is of interest, mortality becomes a clinical parameter. The reader should understand that mortality among cases is more commonly referred to as fatality.
Median and maximum follow-up for all 9,192 MLC participants was 8.6 and 11.7 years, respectively. Table 1 lists summary data for each population parameter, subdivided by group. Other than a marginally significant (P = .049) difference in arsenic exposure, there were no significant differences between groups.
During the course of the study, 366 lung cancers were detected in the MLC; 181 (49.5%) were screen-detected, of which 163 (90%) were detected by CXR and 18 (10%) by cytology alone. Table 2 lists clinical parameters, subdivided by group. Among cases, median age at study entry (59 years) and median age at diagnosis (64 years) were identical. Two hundred six cases were identified in the EG and 160 in the CG. Although histology was similar, there were significant differences with regard to method of detection, stage distribution, resectability, and mortality among cases. For example, screen detection was much higher in the EG (65% v 30%, P < .0001), as was resectability (46% v 32%, P = .010). Mortality among cases (ie, fatality) was significantly lower in the EG (59% v 72%, P = .015).
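The case counts and fatality figures above are enough to see, at least crudely, how a 30% incidence excess and lower fatality can coexist with nearly equal mortality. The sketch below is a back-of-the-envelope check, not the Cox analysis itself. It assumes the two arms were of approximately equal size (per-arm totals are not given in this text), so that the ratio of case counts approximates the incidence ratio, and it uses the rounded fatality percentages. The function name is hypothetical.

```python
def crude_ratios(cases_eg, cases_cg, fatality_eg, fatality_cg):
    """Crude EG:CG ratios, ASSUMING arms of roughly equal size.

    mortality ratio = incidence ratio x fatality ratio, because
    deaths per arm = cases per arm x proportion of cases who die.
    """
    incidence_ratio = cases_eg / cases_cg
    fatality_ratio = fatality_eg / fatality_cg
    mortality_ratio = incidence_ratio * fatality_ratio
    return incidence_ratio, fatality_ratio, mortality_ratio


# Figures quoted in the text: 206 v 160 cases, fatality 59% v 72%.
inc, fat, mort = crude_ratios(206, 160, 0.59, 0.72)
print(f"incidence ratio ~ {inc:.2f}")
print(f"fatality ratio  ~ {fat:.2f}")
print(f"mortality ratio ~ {mort:.2f}")
```

Under these assumptions the crude incidence and mortality ratios land close to the reported hazard ratios of 1.30 and 1.06, which illustrates why higher incidence can mask a fatality advantage in a mortality comparison.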
Lung Cancer Incidence and Mortality
Table 3 lists univariate predictors of lung cancer incidence and mortality. With regard to incidence, randomization to the EG significantly increased lung cancer risk (HR, 1.30; 95% CI, 1.06 to 1.60). Age and duration and extent of smoking also significantly increased risk. Bronchitis/pneumonia and emphysema also increased risk, whereas other exposures did not. Although asbestos and arsenic are known lung cancer carcinogens,16 both paradoxically were associated with a trend toward reduced risk.
With the notable exception of group assignment, predictors of incidence also predicted for increased lung cancer mortality (although amount smoked and emphysema did not quite reach significance). Group, which predicted for a 30% increase in incidence (P = .013), only predicted for a 6% increase in mortality (P = .62). Stratified Cox regression was conducted to determine whether risk factors for lung cancer were identical in the EG and the CG. As shown in Table 4, predictors of incidence were different.
For example, bronchitis/pneumonia only significantly predicted for lung cancer in the EG. Similarly, emphysema was only significantly predictive in the CG. Most striking was the finding that exposure to air pollution, which was not a significant risk factor when the entire cohort was included, significantly predicted for higher lung cancer incidence in the CG (HR, 1.54; 95% CI, 1.10 to 2.17). In the EG, air pollution was associated with a trend toward lower incidence (HR, 0.82; 95% CI, 0.57 to 1.17). Stratified regression for mortality is also listed in Table 4. Results are similar to the stratified analysis for incidence. The finding that bronchitis/pneumonia, emphysema, or air pollution predicted differently for lung cancer incidence and mortality in the EG or CG raises the possibility of effect modification. To test this hypothesis, models using the exposure variable, group, and a group-exposure variable interaction term were created. The results of these analyses are listed in Table 5. As shown, the interaction term was not significant with regard to the group-bronchitis/pneumonia and group-emphysema tests. On the other hand, the interaction term for group-air pollution was statistically significant, in terms of both incidence and mortality.
This analysis provides statistically significant evidence for effect modification with regard to group assignment and air pollution. These statistical analyses do not exclude effect modification with regard to possible interactions between group and bronchitis/pneumonia or group and emphysema, because statistical tests have little power to demonstrate significance with regard to interaction terms. The epidemiologic significance of effect modification will be discussed later. Table 6 lists the best fitting multivariate models that predict for lung cancer incidence and mortality in the MLC. The model includes group, other significant univariate predictors, and the group-air pollution interaction term.
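The effect-modification test described above can be written out explicitly. The following is a sketch in notation assumed here, not the author's: $G$ is 1 for the EG and 0 for the CG, $A$ is 1 for those reporting air pollution exposure, and the $\beta$ coefficients are hypothetical labels.

```latex
h(t \mid G, A) = h_0(t)\,\exp\!\left(\beta_G G + \beta_A A + \beta_{GA}\, G A\right)

\mathrm{HR}_{A \mid \mathrm{CG}} = \exp(\beta_A),
\qquad
\mathrm{HR}_{A \mid \mathrm{EG}} = \exp(\beta_A + \beta_{GA})
```

Effect modification corresponds to $\beta_{GA} \neq 0$: the hazard ratio for the exposure then differs between groups. The stratum-specific estimates reported for air pollution (HR 1.54 in the CG and 0.82 in the EG) roughly play the roles of $\exp(\beta_A)$ and $\exp(\beta_A + \beta_{GA})$, although the stratified and interaction models are fitted separately and need not agree exactly.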
It should be noted that emphysema, which was significant on univariate analyses, was not a significant predictor of either incidence or mortality on multivariate testing. This is because the univariate effect of emphysema was through its association with smoking. With regard to incidence, multivariate analysis demonstrates that group, age, duration and extent of smoking, bronchitis/pneumonia, air pollution, and the group-air pollution interaction term were all significant predictors for lung cancer incidence. When adjusted for all other predictors of incidence, randomization to the EG was associated with a 49% increase in lung cancer incidence (HR, 1.49; 95% CI, 1.18 to 1.90). With regard to mortality, the results of the multivariate analysis are similar. The major exception is that group assignment was not a statistically significant predictor of lung cancer mortality. Nonetheless, when adjusted for other significant predictors, assignment to the EG was associated with a 24% increase in lung cancer mortality (HR, 1.24; 95% CI, 0.92 to 1.67). The finding that randomization to the EG significantly predicted for increased lung cancer incidence but not mortality in the MLC provides the basis for the overdiagnosis hypothesis. The next three sections examine the extent to which observed MLC results are consistent with lead-time bias, length bias, or overdiagnosis bias.
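The way hazard ratios are read in this section (a percentage change in risk, judged significant when the 95% CI excludes 1) can be made concrete with a small helper. This is an illustrative sketch; the function name `describe_hr` is hypothetical, and the CI rule is the usual convention for a two-sided test at the 5% level.

```python
def describe_hr(hr, ci_low, ci_high):
    """Return (percent change in hazard, whether the 95% CI excludes 1)."""
    percent_change = round((hr - 1.0) * 100)
    significant = ci_low > 1.0 or ci_high < 1.0
    return percent_change, significant


# Multivariate estimates for group assignment quoted in the text.
incidence = describe_hr(1.49, 1.18, 1.90)
mortality = describe_hr(1.24, 0.92, 1.67)
print("incidence:", incidence)
print("mortality:", mortality)
```

The incidence estimate is a 49% increase whose interval lies entirely above 1, whereas the mortality estimate is a 24% increase whose interval spans 1, matching the statement that group assignment predicted incidence but not mortality.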
Lead-Time Bias
Figure 1 shows Kaplan-Meier survival (in years) from time of lung cancer diagnosis, and demonstrates a significant advantage for the EG (P = .0021). The survival plateau was 29% (95% CI, 21% to 39%) in the EG and 13% (95% CI, 6% to 22%) in the CG.
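For readers who want to see how a Kaplan-Meier curve such as the one in Figure 1 is built, the product-limit estimator can be sketched in a few lines. The original analysis used STATA; this pure-Python version and its toy follow-up times are assumptions for illustration only and are not MLC data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each case
    events : 1 if the case died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohort: deaths at years 1, 3, 4; censored at years 2, 5.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for t, s in curve:
    print(f"year {t}: S = {s:.3f}")
```

A survival plateau, in this framing, is the level at which the curve stops falling because no further deaths occur among those still at risk.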
Because survival plateaus do not converge, lead-time bias is unlikely to fully account for observed survival differences. However, one cannot exclude the possibility that average lead time was so prolonged that longer follow-up was needed for convergence to be observed. In the context of a mature randomized trial, though, lead-time bias can be excluded by comparing survival among cases from time of randomization. This comparison is shown in Fig 2, which demonstrates a significant advantage favoring the EG (P = .012). Because time of diagnosis has been eliminated as a variable, the survival difference cannot be attributed to lead-time bias.
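The logic of measuring survival from randomization can be illustrated with a single hypothetical patient whose date of death is unchanged by screening. The sketch below uses invented times (in years) and a hypothetical helper name; it shows only the mechanism of lead-time bias, not any MLC result.

```python
def survival_years(randomized_at, diagnosed_at, died_at):
    """Survival measured from diagnosis and from randomization."""
    return {
        "from_diagnosis": died_at - diagnosed_at,
        "from_randomization": died_at - randomized_at,
    }


# Same hypothetical patient, death at year 6 either way.
# Without screening, symptoms lead to diagnosis at year 5.
symptom = survival_years(randomized_at=0.0, diagnosed_at=5.0, died_at=6.0)
# With screening, diagnosis is advanced by a 2-year lead time.
screened = survival_years(randomized_at=0.0, diagnosed_at=3.0, died_at=6.0)

print("symptom-detected:", symptom)
print("screen-detected: ", screened)
```

Survival from diagnosis appears 2 years longer with screening even though the patient gains nothing, while survival from randomization is identical. A difference that persists when the clock starts at randomization therefore cannot be produced by lead time alone.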
Length Bias
Length-biased sampling refers to the tendency of screening to preferentially detect cases that have a long preclinical duration. Length bias refers to survival advantages that would be anticipated when comparing screen-detected to symptom-detected cases. This is precisely what is done in Fig 3, which demonstrates a highly significant survival advantage for screen-detected cases (P < .0001). The survival plateau after 7 years among screen-detected cases was 38% (95% CI, 28% to 47%); among symptom-detected cases, 5-year survival was 3% (95% CI, 0% to 12%).
Figure 3 clearly reflects an element of length bias. However, length bias can be eliminated in a randomized trial by conducting an intent-to-treat analysis. In this regard, survival among all EG and CG cases is compared, including screen-detected and symptom-detected cases. Symptom-detected cases must include interval cancers, cancers among those who refused screening, and cancers detected during the follow-up period. Figs 1 and 2 demonstrate the results of exactly such an intent-to-treat analysis. Accordingly, these figures demonstrate a significant survival advantage in the EG that cannot be attributed to length bias.
Overdiagnosis Bias
Overdiagnosis was only raised in the context of a post hoc interpretation of the data. From a theoretical perspective, higher incidence in the EG, coupled with a survival/mortality discrepancy, can be attributed to the detection of pseudodisease. However, regression techniques can be used that permit a direct assessment of whether overdiagnosis is consistent with the MLC data. Table 7 demonstrates results of Cox regression that relates clinical parameters to risk of lung cancer mortality. (The reader is reminded that fatality is modeled when mortality among cases is of interest.)
Table 7 lists univariate results and demonstrates a significant reduction in HR associated with each clinical parameter. For example, screen detection was associated with a 51% mortality reduction (HR, 0.49; 95% CI, 0.37 to 0.63), whereas resection was associated with a 71% reduction (HR, 0.29; 95% CI, 0.21 to 0.40). Randomization to the EG, early-stage disease, and NSCLC were also significant on univariate testing. Univariate testing, however, does not distinguish between the possibility that screening was efficacious versus the alternative that screening led to overdiagnosis. Conversely, multivariate analysis does permit such a distinction. If screen detection were a surrogate for pseudodisease, then one would expect screen detection to remain significant on multivariate testing. Conversely, if screen detection were a surrogate for resectability, then screen detection would not be significant when adjusted for resection. Table 7 lists results of a multivariate model, which includes as predictor variables resection, screen detection, NSCLC, and group. (Other variables were removed because of collinearity.) The results demonstrate that resection was the only significant multivariate predictor (HR, 0.32; 95% CI, 0.22 to 0.47). When adjusted for resection, screen detection completely dropped out (HR, 1.01; 95% CI, 0.73 to 1.40). The fact that resection is the only significant multivariate predictor supports the conclusion that effective therapy is directly responsible for reducing lung cancer mortality among cases in MLC. Indeed, as shown in Table 8, stratified analysis demonstrates that resection is the only significant predictor of mortality in both the EG and the CG. When adjusted for differences in resection, screen detection is neither a significant nor powerful predictor of mortality.
Table 8 also shows a stratified logistic regression, with resection as the outcome variable. Both screen detection and NSCLC are highly significant predictors of resection in the EG and the CG. Accordingly, screen detection is extremely important, since it is the major determinant of resection. However, screen detection is not a proxy for pseudodisease. Indeed, the absence of survival beyond 6 years for even a single unresected patient further underscores the fact that MLC data are inconsistent with the hypothesis that pseudodisease exists in lung cancer. Screen detection is only important because it leads to potentially curative therapy.
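The reasoning behind adjusting screen detection for resection can be illustrated with a deliberately simple stratified table. The counts below are hypothetical, invented purely for illustration, and are not MLC data; they are constructed so that fatality depends only on resection, while screen-detected cases are resected more often. The sketch uses crude proportions rather than Cox regression.

```python
# HYPOTHETICAL counts (not MLC data).
# Key: (detection mode, resected?) -> (cases, lung cancer deaths)
counts = {
    ("screen", True): (60, 30),
    ("screen", False): (40, 40),
    ("symptom", True): (20, 10),
    ("symptom", False): (80, 80),
}


def fatality(mode, resected=None):
    """Fatality for a detection mode, overall or within one resection stratum."""
    cases = deaths = 0
    for (m, r), (n, d) in counts.items():
        if m == mode and (resected is None or r == resected):
            cases += n
            deaths += d
    return deaths / cases


print("crude:      screen", fatality("screen"), "symptom", fatality("symptom"))
print("resected:   screen", fatality("screen", True), "symptom", fatality("symptom", True))
print("unresected: screen", fatality("screen", False), "symptom", fatality("symptom", False))
```

In this toy table the crude comparison favors screen detection, yet within each resection stratum fatality is identical. That is the pattern expected when screen detection acts through resection, and it is the pattern the multivariate model is used to look for. Had screen detection marked pseudodisease, an advantage would be expected to persist among unresected cases.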
Resection and Survival
In dramatic contrast, among those not undergoing resection, survival at 5 years was 2% (95% CI, 1% to 8%). Indeed, the 2% survival reflects a single unresected patient who remained alive at 5 years. Unfortunately, this patient died of lung cancer 70 months after diagnosis. In reality, cure was not achieved in even a single unresected MLC patient.
CXR, Cytology, and Survival
As shown in Fig 5, cytology-detected cases had the best prognosis, with a survival plateau of 72% (95% CI, 39% to 89%). Survival plateau was 33% (95% CI, 23% to 43%) for CXR-detected cases. Among 185 symptom-detected cases, 5-year survival was 3% (95% CI, 0% to 11%). However, only 5% of all cases were cytology-detected. Among 181 screen-detected cases, 10% (n = 18) were cytology-detected and 90% (n = 163) were CXR-detected.
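The two percentages quoted for cytology (10% and 5%) use different denominators, which a short calculation makes explicit. The counts are those given in the text; the helper name `pct` is hypothetical.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


cytology, cxr, symptom = 18, 163, 185  # cases by method of detection
screen = cytology + cxr                # all screen-detected cases
total = screen + symptom               # all lung cancers in the MLC

print("cytology share of screen-detected:", pct(cytology, screen), "%")
print("CXR share of screen-detected:     ", pct(cxr, screen), "%")
print("cytology share of all cases:      ", pct(cytology, total), "%")
```

Cytology accounts for about one in ten screen-detected cancers but only about one in twenty of all 366 cancers, so its favorable survival plateau applies to a small subgroup.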
Mortality, Randomization, and Confounding
In addition to eliminating selection bias, the objective of randomization is to control for confounding.17 Confounding is a distortion that arises when comparisons are made between noncomparable groups. A confounder is an independent risk factor for outcome that is also significantly associated with exposure.18,19 In the MLC, lung cancer incidence was 30% higher in the EG. The finding that excess incidence was not because of overdiagnosis raises the question of whether incidence discrepancies reflect imbalances in unmeasured and/or unknown confounders. Conversely, Table 1 demonstrates no imbalances in any measured risk factor. It has generally been assumed that if randomization leads to balanced allocation among measured variables, then unmeasured confounders are also equally distributed.17,20 However, a theoretical basis exists to question this assumption in population-based randomized trials. Such trials, which study the effect of interventions in large healthy populations, have also been referred to as randomized population trials (RPTs).15,21 Most have focused on cancer chemoprevention or early detection. The suggestion is that randomization is more likely to fail in an RPT than in a randomized clinical trial (RCT). In an RCT, randomization need only balance clinical parameters. In cancer, these consist of standard prognostic factors, such as nodal status, tumor size, and histologic grade. Conversely, in an early detection RPT using a mortality end point, randomization must also balance those confounders that determine which individuals in the general population are truly at risk to develop the target cancer. Although we do not know the full extent of these confounders, they likely include a broad spectrum of exposure and susceptibility variables.
Although approximately 90% of lung cancers are smoking-related, only 16% of male and 9% of female lifelong smokers develop lung cancer.22 Such figures underscore the enormous variability that exists with regard to individual susceptibility to tobacco-related carcinogens. Because other environmental exposures influence lung cancer risk among smokers, it is conceivable that imbalances in an unmeasured exposure variable may have contributed to incidence differences in the MLC. However, it is more likely that incidence discrepancies relate to differences in inherited susceptibility factors that modulate cancer risk among exposed individuals.16,23 Although high-penetrance genes (such as BRCA1 and BRCA2 in breast cancer) have received enormous attention, these lead to only a small proportion of cancers in the general population.24 No high-penetrance gene has been identified for lung cancer. Moreover, because environmental exposures do not contribute substantially to cancers caused by high-penetrance genes, they are unlikely to play a significant role in lung cancer. Conversely, a growing body of evidence indicates that large numbers of low-penetrance genes modulate the effect of tobacco-related carcinogens among cigarette smokers.23,24 The existence of numerous low-penetrance genes that interact with environmental exposures in defining cancer risk at the individual level clearly represents an obstacle to successful randomization.15 No direct evidence exists for an imbalance in susceptibility variables in the MLC, because none were investigated. Even family history of lung cancer, a well-known surrogate for genetic susceptibility, was not ascertained in the MLC. However, the finding of effect modification between group and exposure to air pollution is most consistent with imbalances in susceptibility factors. 
Exposure to air pollution significantly predicted for higher lung cancer incidence and mortality in the CG, whereas this exposure was associated with a trend toward reduced incidence and mortality in the EG. Because randomization is assumed to control for confounding, randomization to the EG is considered a surrogate for the experimental intervention, which in this case is screening for lung cancer. However, there is no biologically plausible mechanism why exposure to air pollution would significantly alter the relationship between screening and lung cancer risk. Consequently, it is tempting to discount the effect modification as a chance occurrence, possibly related to multiple testing. Conversely, if imbalances in low-penetrance genes were responsible for incidence discrepancies, effect modification would be predictable. By their nature, low-penetrance genes interact with environmental exposures, leading to risk variability. Accordingly, the finding of effect modification is consistent with bad randomization in the MLC. Even stronger evidence for randomization failure is provided by the finding that lung cancer incidence was 30% higher in the EG for reasons unrelated to overdiagnosis. Incidence differences unrelated to any conventional screening bias support the conclusion that these discrepancies were because of imbalances in unmeasured confounders. Moreover, incidence differences confounded the relationship between group assignment and lung cancer mortality. Consequently, the mortality end point was not an accurate measure of screening efficacy in the MLC.
The MLC was undertaken to answer the question of whether screening with CXR and sputum cytology improved outcome in lung cancer. However, in the context of the survival/mortality discrepancy, the answer is dependent on which end point provides an unbiased measure of screening efficacy. Although mortality is assumed to be unbiased, a primary objective of this article is to demonstrate the need to draw inferences on the basis of data rather than assumptions. For decades, decisions about the effectiveness of screening interventions in RPTs have been predicated on two assumptions related to survival and mortality.10,25,26 First, survival is assumed to be biased, because of confounding effects of lead-time bias, length bias, and/or overdiagnosis bias. Second, mortality is assumed to be unbiased, because it is the only end point fully dependent on the randomization process. By eliminating selection bias and controlling for confounding, randomization is assumed to select the EG and CG that have an equivalent risk for disease-specific mortality, except insofar as screening reduces that risk. However, data-driven analyses presented herein demonstrate that survival was not confounded by any of the conventional biases in the MLC. Indeed, in an RPT in which survival is measured from randomization and in which both screen- and symptom-detected cases are included in the analysis, lead-time bias and length bias do not confound survival. Lead-time bias is eliminated when survival is measured from randomization, whereas length bias is eliminated by analyzing survival using intent-to-treat principles. A process of elimination12 has long provided the entire basis for the conclusion that overdiagnosis was responsible for the survival/mortality discrepancy in the MLC.13 Indeed, there is no alternative to the conclusion that pseudodisease accounts for this discrepancy, if mortality is assumed to be unbiased. 
Conversely, a process of elimination is inappropriate for drawing definitive conclusions regarding overdiagnosis. Pseudodisease can only exist in the context of extraordinarily indolent biologic behavior, which bears little relationship to how lung cancer behaves. Moreover, overdiagnosis is a testable hypothesis. An RCT comparing surgical resection to watchful waiting among those with screen-detected disease would provide a direct test of overdiagnosis. Indeed, such an RCT is currently ongoing in prostate cancer.27 Conversely, there have been no serious proposals for such a study in lung cancer. The author would suggest that such a trial has not been seriously suggested because overdiagnosis so completely contradicts all that is known about lung cancer, that such a proposal would be judged unacceptable, both scientifically and ethically. Nonetheless, the possibility that CXR screening led to the detection of pseudodisease in the MLC has been the major impediment to population-based screening for lung cancer for over two decades. This is true despite the fact that the Mayo Lung Project was not designed to assess for the reality of overdiagnosis. For this reason, it is impossible to evaluate the overdiagnosis hypothesis in the context of its randomized design. Conversely, multiple regression models and other statistical techniques more commonly used for observational research offer an opportunity to evaluate the extent to which overdiagnosis is consistent with observed MLC results. These analyses demonstrate that the data are inconsistent with the overdiagnosis hypothesis. Indeed, the data tell us that curative resection is the only important predictor of outcome in lung cancer. Screen detection is important only insofar as it predicts for curative resection. Conversely, if screening were associated with overdiagnosis, screen detection would be a surrogate for pseudodisease. 
However, this was not the case, because zero of 185 unresected MLC patients (including 22 who were screen-detected) achieved long-term survival. There is absolutely no evidence that any of 366 lung cancers in the MLC were, in reality, "pseudo-lung cancers." The finding that survival was not confounded by lead-time bias, length bias, or overdiagnosis bias directly leads to the conclusion that survival was unbiased in the MLC. Moreover, although Fig 1 demonstrates the survival advantage associated with randomization to the EG, it clearly underestimates the benefit of screening. This is because of the extensive screening conducted in the CG, on the basis of the recommendation for annual screening. Figures 4 and 5 accurately demonstrate what screening and resection, respectively, accomplished in the MLC. Of course, a survival/mortality discrepancy coupled with the finding that survival was unbiased confronts us with the need to reconsider our assumptions regarding mortality. Obviously, if survival provides an accurate measure of what screening accomplished, there is no alternative to the conclusion that mortality was biased. The fact that lung cancer incidence was 30% higher in the EG for a reason unrelated to pseudodisease provides strong evidence for randomization failure in the MLC. Moreover, the finding of significant effect modification between an exposure variable (air pollution) and group assignment provides additional support for this hypothesis. Although it is impossible to verify the suggestion that randomization failed to control for confounding because of low-penetrance genes, the finding of significant incidence differences unrelated to conventional screening biases provides strong evidence that the mortality end point was biased in the cohort. 
Indeed, controversy over screening for cancer has derived primarily from our uncritical acceptance of cause-specific mortality as the primary end point.15,26 Mortality can be biased in an RPT, as it was in the MLC. Survival, on the other hand, can provide definitive evidence of screening efficacy in a mature RPT.
In conclusion, randomization to screening significantly improved lung cancer survival in the MLC. Screen detection led to surgical resection, which was directly responsible for a 50% cure rate among those undergoing surgery. Indeed, this is the first report to demonstrate that survival may provide an unbiased measure of screening efficacy in the RPT setting. In the MLC, lung cancer survival represented an accurate surrogate for cure.
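The case counts quoted in the discussion can be combined in a short worked calculation. The sketch below assumes that every one of the 366 MLC lung cancers was either resected or unresected and that the 50% cure rate applies uniformly to resected cases; the resulting figures are approximations for illustration, not reported results.

```python
TOTAL_CASES = 366          # lung cancers diagnosed in the MLC
UNRESECTED_CASES = 185     # none achieved long-term survival
RESECTED_CURE_RATE = 0.50  # approximate cure rate among resected cases

# Assumption: every case is either resected or unresected.
resected_cases = TOTAL_CASES - UNRESECTED_CASES
expected_cures = RESECTED_CURE_RATE * resected_cases
overall_cure_fraction = expected_cures / TOTAL_CASES

print(f"Resected cases: {resected_cases}")
print(f"Approximate cures: {expected_cures:.1f}")
print(f"Approximate overall cure fraction: {overall_cure_fraction:.1%}")
```

On these assumptions about one case in four was cured overall, and every cure came from the resected group.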
I thank Robert Fontana, MD, David Sanderson, MD, and William Taylor, PhD, of the Mayo Clinic; Nat Berlin, MD, NCI organizer of the Mayo study; and Constantine Gatsonis, PhD, of the Brown University Statistical Sciences Center. Special thanks go to Marvin Zelen, PhD, of the Dana-Farber Cancer Institute, who provided invaluable assistance regarding statistical issues.
1. Patz EF, Goodman PC, Bepler G: Screening for lung cancer. N Engl J Med 343: 1627-1633, 2000
2. Frame PS: Routine screening for lung cancer? Maybe someday, but not yet. JAMA 284: 1980-1983, 2000
3. Smith RA, Glynn TJ: Early lung cancer detection: Current and ongoing challenges. Cancer 89: 2327-2328, 2000
4. Melamed MR, Flehinger BJ, Heelan RT, et al: Screening for early lung cancer: Results of the Memorial Sloan-Kettering study in New York. Chest 86: 44-53, 1984
5. Tockman MS: Survival and mortality from lung cancer in a screened population: The Johns Hopkins study. Chest 89: 325S-326S, 1986
6. Ries LAG, Kosary CL, Hankey BF, et al: SEER Cancer Statistics Review, 1973-1995. Bethesda, MD, National Cancer Institute, 1998
7. Fontana RS, Sanderson DR, Taylor WF, et al: Early lung cancer detection: Results of the initial (prevalence) radiologic and cytologic screening in the Mayo Clinic study. Am Rev Respir Dis 130: 561-565, 1984
8. Fontana R, Sanderson DR, Woolner LB, et al: Lung cancer screening: The Mayo Program. J Occup Med 28: 746-750, 1986
9. Fontana R, Sanderson DR, Woolner LB, et al: Screening for lung cancer: A critique of the Mayo Lung Project. Cancer 67: 1155-1164, 1991
10. Morrison AS: Screening in Chronic Disease, ed 2. New York, NY, Oxford University Press, 1992
11. Eddy D: Screening for lung cancer. Ann Intern Med 111: 232-237, 1989
12. Black WC: Overdiagnosis: An unrecognized cause of confusion and harm in cancer screening. J Natl Cancer Inst 92: 1280-1282, 2000
13. Marcus PM, Bergstralh EJ, Fagerstrom RM, et al: Lung cancer mortality in the Mayo Lung Project: Impact of extended follow-up. J Natl Cancer Inst 92: 1308-1316, 2000
14. Flehinger BJ, Kimmel M, Melamed MR: The effect of surgical treatment on survival from early lung cancer: Implications for screening. Chest 101: 1013-1018, 1992
15. Strauss GM: Randomized population trials and screening for lung cancer: Breaking the cure barrier. Cancer 89: 2399-2421, 2000
16. Samet JM (ed): Epidemiology of Lung Cancer: Lung Biology in Health and Disease, vol 74. New York, NY, Marcel Dekker, 1994, pp 1-543
17. Rothman KJ, Greenland S: Modern Epidemiology, ed 2. Philadelphia, PA, Lippincott-Raven, 1998, pp 1-738
18. Weinberg CR: Toward a clearer definition of confounding. Am J Epidemiol 137: 1-8, 1993
19. Pearl J: Causality: Models, Reasoning, and Inference. Cambridge, England, Cambridge University Press, 2000, pp 1-384
20. Hennekens CH, Buring JE: Analysis of epidemiologic studies: Evaluating the role of confounding, in Mayrent SL (ed): Epidemiology in Medicine. Boston, MA, Little Brown and Company, 1987, pp 287-323
21. Dempster A: Logicist statistics: Models and modeling. Stat Sci 13: 248-276, 1998
22. Peto R, Darby S, Deo H, et al: Smoking, smoking cessation, and lung cancer in the UK since 1950: Combination of national statistics with two case-control studies. BMJ 321: 323-329, 2000
23. Bepler G: Lung cancer epidemiology and genetics. J Thorac Imaging 14: 228-234, 1999
24. Shields PG, Harris CC: Cancer risk and low-penetrance susceptibility genes in gene-environment interactions. J Clin Oncol 18: 2309-2315, 2000
25. Prorok PC, Hankey BF, Bundy BN: Concepts and problems in the evaluation of screening programs. J Chronic Dis 34: 159-171, 1981
26. Strauss GM, Dominioni L: Perception, paradox, paradigm: Alice in the wonderland of lung cancer prevention and early detection. Cancer 89: 2422-2431, 2000
27. Wilt TJ, Brawer MK: Early intervention or expectant management for prostate cancer: The prostate cancer intervention versus observation trial (PIVOT): A randomized trial comparing radical prostatectomy with expectant management for the treatment of clinically localized prostate cancer. Semin Urol 13: 130-136, 1995
Submitted May 14, 2001; accepted January 10, 2002.
Copyright © 2002 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X