
Originally published as JCO Early Release 10.1200/JCO.2005.07.022 on September 26 2005

Journal of Clinical Oncology, Vol 23, No 30 (October 20), 2005: pp. 7380-7384
© 2005 American Society of Clinical Oncology.


EDITORIAL

Learning to Live With Missing Quality-of-Life Data in Advanced-Stage Disease Trials

Gary W. Donaldson

Pain Research Center, Department of Anesthesiology, University of Utah, Salt Lake City, UT

Carol M. Moinpour

Southwest Oncology Group Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA

In advanced-stage disease, palliation can be the most important objective.1 Quality-of-life (QoL) outcomes provide an intuitively reasonable way to monitor palliation from the patient’s perspective. However, patients with advanced-stage disease are often unable to complete a series of QoL assessments because of deteriorating health.2-5 This introduces a possible bias in a clinical trial because patients who are healthy enough to complete the assessments may not represent the target population.6,7

In this issue, Brown et al8 report QoL outcomes from a non–small-cell lung cancer trial comparing supportive care regimens with and without chemotherapy. The analysis of completers only was based on half the baseline sample, and the proportion of completers differed between treatment arms. Despite comprehensive descriptions of the missing data problems and sensitivity analyses designed to cope with them, considerable uncertainty remains. Uncertainty is inevitable with this degree of missing data, but a few key questions can both sharpen focus and suggest practical steps to limit the impact of missing data.

MISSING DATA: NONIGNORABLE OR IGNORABLE?

Most clinical trials that include QoL assessments encounter some degree of missing data. Although small amounts of incidental missing data may be essentially harmless, a substantial amount of missing data can have two major impacts. At a minimum, the missing data will result in wider confidence intervals and reduced power. The larger issue is the likelihood that missing data closely linked to patients’ health and QoL may result in bias. For example, QoL in progressive disease may deteriorate more rapidly as the disease worsens, so that patients with poorer QoL at study entry may worsen more rapidly, and therefore drop out earlier, than patients with better QoL at study entry. This mechanism affects inferences about how QoL changes in the trial. This is the intuition motivating the recognition that so-called nonignorable missing data mechanisms demand special analytic approaches.

The question of whether missing data can be ignored (ignorability) hinges on whether the missing data mechanism can be considered missing at random (MAR) in a particular technical sense. Data are considered MAR if missing values occur exclusively as a function of observed measures (eg, background covariates such as age or disease stage, previously assessed QoL). Slightly more formally, data are MAR if Pr(missing | measured values, unmeasured values) = Pr(missing | measured values), where Pr denotes conditional probability. Unmeasured values include not just missing observations on the QoL outcome itself, but also failure to capture other relevant observable or unobservable variables (eg, true underlying health states or changes, time before death, overlooked predictors). An MAR mechanism can be ignored for a likelihood-based (or Bayesian) analysis that properly incorporates the measured values as covariates or strata.10 Practically, this suggests that standard analysis techniques, such as many mixed-effects models, can give the correct estimates despite the missing data. (Older approaches, such as multivariate analysis of variance, require complete longitudinal cases and the much stronger assumption that observations are missing completely at random, a special case of MAR.) When the mechanism depends on unmeasured values, the data are not MAR, the analysis must model the nonignorable missing data mechanism, and uncertainty in selecting the correct model is high.
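The distinction can be written out side by side. The notation here is an assumption introduced for illustration and does not appear in the editorial: M is the missingness indicator, Y_obs and Y_mis are the observed and missing QoL values, X is the set of fully observed covariates, and U stands for any other unmeasured variables. MCAR is shown in its strictest form.

```latex
\begin{align*}
\text{MCAR:}\quad & \Pr(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X, U) = \Pr(M) \\
\text{MAR:}\quad  & \Pr(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X, U) = \Pr(M \mid Y_{\mathrm{obs}}, X) \\
\text{Not MAR:}\quad & \Pr(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, X, U)
  \text{ still depends on } Y_{\mathrm{mis}} \text{ or } U
\end{align*}
```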

IS THERE A PRAGMATIC ANALYSIS STRATEGY THAT CAN WORK DESPITE ALL THIS UNCERTAINTY?

A conservative presumption is that all advanced-stage trials reflect nonignorable missingness. Statisticians disagree about the best approaches to handling data that are nonignorable, although there is consensus that a sensitivity analysis is critical for evaluating the impact of the unverifiable yet crucial assumptions required by a nonignorable model. As a pragmatic strategy, we suggest considering how best to reframe a problematic analysis in an MAR context, at least as a first approximation, before conducting a demanding set of sensitivity analyses for a particular nonignorable approach. It is important to appreciate that "ignorability is relative"11 (p 23). Even when the nonresponse mechanism is largely unknown, observable covariates may be available that can reduce the degree of unexplained missingness. A broad strategy that incorporates thoughtful selection of covariates, in conjunction with a rich analysis model, may help reframe a nonignorable process as one that is (approximately) MAR.11-13

Here is the logic of the pragmatic strategy we recommend. It is common for the QoL response measures Y to be associated with missingness M. But if one can find an observable set of covariates X, such that Y and M are independent conditional on X, then conditioning on X provides a platform for conducting an MAR analysis. In practice, it will be impossible to verify complete conditional independence, but this may not be essential if conditioning on X greatly reduces the association between Y and M.
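A minimal numerical sketch of this logic follows, under assumptions that are ours rather than the editorial's: a single normally distributed covariate X, a linear relationship between X and Y, and a logistic missingness model that depends on X alone. All names and parameter values are hypothetical. Because the data are simulated, Y is known even for "missing" cases, which is never true in a real trial.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# X: fully observed covariate (e.g., baseline QoL); Y: later QoL outcome.
x = rng.normal(50, 10, n)
y = 50 + 0.7 * (x - 50) + rng.normal(0, 7, n)

# Assumed mechanism: the probability of being missing depends on X only.
p_missing = 1 / (1 + np.exp((x - 45) / 5))
m = rng.random(n) < p_missing  # True = missing

def mean_gap(y, m):
    """Mean of Y among responders minus mean of Y among nonresponders."""
    return y[~m].mean() - y[m].mean()

# Marginal association between Y and M.
marginal_gap = mean_gap(y, m)

# Condition on X by stratifying into quartiles (stratum codes 0..3).
edges = np.quantile(x, [0.25, 0.5, 0.75])
stratum = np.digitize(x, edges)
stratum_gaps = [mean_gap(y[stratum == k], m[stratum == k]) for k in range(4)]

print(f"marginal gap: {marginal_gap:.2f}")
print("within-quartile gaps:", np.round(stratum_gaps, 2))
```

The within-quartile gaps should be much smaller than the marginal gap but not exactly zero, because quartiles condition on X only coarsely. This mirrors the point that complete conditional independence may be out of reach while a large reduction in the association is still useful.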

Figures 1 and 2 illustrate this strategy using simulated data reflective of patterns from an advanced-stage disease trial. The figures show means and missing data patterns for longitudinal subsamples, or cohorts, defined by duration on study for three postrandomization QoL assessments. The lines terminate at the dropout points for each subsample, with isolated dots corresponding to single assessments.3-5,9,14,15



Fig 1. Consistent with an ignorable (missing at random) missing data process, conditioning on a fully observed covariate minimizes the association between missingness and QoL (ie, cohort patterns within strata of the covariate are similar). (A) Mean observed QoL scores (higher scores indicate a better QoL) by assessment time for subsamples based on number of available assessments (cohorts). (B) Mean observed QoL by assessment time for cohorts stratified by baseline QoL quartile. QoL, quality of life.

 


Fig 2. Consistent with a nonignorable missing data process, conditioning on a fully observed covariate does not eliminate the association between missingness and QoL (ie, cohort patterns within strata of the covariate are not similar). (A) Mean observed QoL scores (higher scores indicate a better QoL) by assessment time for subsamples based on number of available assessments (cohorts). (B) Mean observed QoL by assessment time for cohorts stratified by baseline QoL quartile. QoL, quality of life.

 
In Figures 1A and 2A, levels of QoL are associated with patterns of missingness: better QoL scores correspond with more complete assessments. Mean QoL was worst at time 1 for patients who dropped out before time 2 (single point), best for patients who completed all assessments (upper line), and intermediate for patients with two assessments (middle line). The cohort with complete assessments has higher (better) mean QoL than the other cohorts at all time points.

Figures 1A and 2A were based on the same data generation process, with each observation obtained by regression on the previous occasion. Despite the appearance of systematic trends, the true simulated population means for both figures were set equal (to 50) across all occasions before removing missing observations. The trends in the observed data derive from the actions of the simulated missing data mechanisms, which differed between Figures 1A and 2A. Figure 1A displays mean observed data after a simulated MAR process, with missingness entirely a function of an observed baseline covariate (not shown in the figure). Figure 2A shows mean observed data after a simulated process that is not MAR, with missingness dependent on an individual’s value of QoL at each time point.
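The simulation design described above can be sketched roughly as follows. This is not the authors' code: the autoregressive coefficient, the standard deviation, the logistic dropout model, and the function names `simulate` and `cohort_means` are all assumptions chosen for illustration. Only the broad design (each occasion regressed on the previous one, population means of 50, one mechanism driven by a baseline covariate and one by the current QoL value) comes from the text.

```python
import numpy as np

def simulate(mechanism, n=20_000, seed=1):
    """Return QoL scores (column 0 = baseline, 1-3 = follow-up) and the
    number of follow-up assessments each simulated patient completed."""
    if mechanism not in ("MAR", "NOT_MAR"):
        raise ValueError("mechanism must be 'MAR' or 'NOT_MAR'")
    rng = np.random.default_rng(seed)
    rho, sd = 0.7, 10.0
    innov_sd = sd * np.sqrt(1 - rho**2)  # keeps the variance constant over time
    y = np.empty((n, 4))
    y[:, 0] = rng.normal(50, sd, n)
    for t in range(1, 4):
        # Each occasion is a regression on the previous one; true mean stays 50.
        y[:, t] = 50 + rho * (y[:, t - 1] - 50) + rng.normal(0, innov_sd, n)

    n_obs = np.ones(n, dtype=int)        # everyone completes time 1
    still_in = np.ones(n, dtype=bool)
    for t in (2, 3):
        # MAR: dropout driven by the observed baseline value.
        # NOT_MAR: dropout driven by the current value, unseen once missing.
        driver = y[:, 0] if mechanism == "MAR" else y[:, t]
        p_stay = 1 / (1 + np.exp(-(driver - 45) / 5))
        still_in &= rng.random(n) < p_stay
        n_obs[still_in] = t
    return y, n_obs

def cohort_means(y, n_obs):
    """Mean observed QoL at follow-up times 1..k for the cohort with k assessments."""
    return {k: y[n_obs == k, 1:k + 1].mean(axis=0) for k in (1, 2, 3)}

for mech in ("MAR", "NOT_MAR"):
    y, n_obs = simulate(mech)
    print(mech, "true means:", np.round(y[:, 1:].mean(axis=0), 1))
    for k, means in cohort_means(y, n_obs).items():
        print(f"  cohort with {k} assessment(s):", np.round(means, 1))
```

Under both mechanisms the full-data means stay near 50 while the observed cohort means fan out, which is the pattern the A panels are meant to convey.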

The effects of conditioning on a fully observed baseline covariate are quite different for the two processes. Figures 1B and 2B stratify the observed relationships by quartiles of the baseline measure (with each subplot containing 25% of the patients in the overall graphs). The conditioning plots in Figure 1B reveal that missing data patterns and levels of QoL are nearly independent, conditional on strata of X, the baseline measure. That is, within each level of X, the cohort lines (and point) are nearly superimposed. The strong association visible in Figure 1A between missing data pattern (cohort) and level of QoL nearly disappears within levels of X. (Note that MAR does not require that patterns be the same across strata.) In contrast, conditioning on the baseline covariate does not achieve conditional independence in Figure 2B, because the missing data mechanism behind Figure 2A was not MAR. These simulations and plots provide evidence of how a conditioning variable (in this case, a baseline observation), if properly modeled, could lead to an analysis that is much closer to an MAR ideal (Fig 1). In a similar spirit, Troxel et al16 describe a quantitative index to evaluate which particular MAR models, given the observed data, might be robust to minor departures from the ignorability assumption.

In general, the suggestion is to consider which observed covariates might most effectively reduce the conditional association between outcomes and nonresponse. In the longitudinal context, the baseline QoL score is first among equals as a candidate for conditioning. If missing data patterns look similar when stratified by baseline QoL, then an analysis that adjusts for baseline QoL, using an analysis of covariance strategy, may be sufficient. More generally, the data model must exploit effectively the missingness information in the observed covariates to approach the ignorable ideal. This criterion should inform technical analysis choices (for example, sometimes favoring autoregressive over compound symmetry covariance structures) and encourage consideration of models that are potentially rich in covariates and interactions.
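To make the baseline-adjustment idea concrete, here is a small simulated sketch. Everything in it is hypothetical: two arms with no true treatment effect, dropout that depends only on the observed baseline score but more steeply in one arm, and an analysis of covariance fitted by ordinary least squares to the available cases. It illustrates the principle under a correctly specified linear model and is not a substitute for a full longitudinal analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
arm = rng.integers(0, 2, n)          # 0 = control, 1 = treatment (hypothetical)
baseline = rng.normal(50, 10, n)
true_effect = 0.0                    # assumed: no real treatment difference
followup = 50 + 0.7 * (baseline - 50) + true_effect * arm + rng.normal(0, 7, n)

# MAR dropout: depends on the observed baseline score only,
# but much more steeply in the control arm than in the treatment arm.
scale = np.where(arm == 0, 3.0, 20.0)
p_observed = 1 / (1 + np.exp(-(baseline - 45) / scale))
observed = rng.random(n) < p_observed

b, f, a = baseline[observed], followup[observed], arm[observed]

# Unadjusted comparison of available follow-up scores.
naive_diff = f[a == 1].mean() - f[a == 0].mean()

# ANCOVA: follow-up ~ intercept + arm + baseline, fitted to available cases.
design = np.column_stack([np.ones(b.size), a, b])
coef, *_ = np.linalg.lstsq(design, f, rcond=None)
ancova_effect = coef[1]

print(f"unadjusted difference: {naive_diff:.2f}")
print(f"baseline-adjusted difference: {ancova_effect:.2f}")
```

In this setup the unadjusted difference drifts away from zero because the two arms retain different mixes of baseline scores, while the adjusted estimate stays close to the true value. If dropout also depended on unobserved follow-up scores, the adjustment would no longer be sufficient.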

WHAT CAN BE DONE WHEN PRAGMATIC IGNORABILITY FAILS?

If approximate conditional independence cannot be obtained with any set of covariates, then nonignorability must be addressed by an explicit model for the missing data mechanism. Given that these models rely ineluctably on unverifiable assumptions, there is strong consensus that analysis should consider a range of assumptions and models. Sensitivity analyses systematically vary the assumptions for the missing data mechanism or for distributional characteristics of the data set.4-6,10 Consistency in the results of sensitivity analyses confers more confidence in the specific QoL finding. However, investigators should be prepared for uncertainty because the analyses may not agree. With large amounts of nonignorable missing data, the "treatment effect" is a chimera, given that the definitive evidence is missing and unattainable. The best that can be done is to summarize the range of possibilities. From this perspective, it is reasonable to narrow the uncertainty as much as possible by anchoring the analyses to observed covariates that may explain much of the missingness.
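One simple way to "summarize the range of possibilities" is a delta-adjustment sensitivity analysis. The editorial does not name this technique; it is used here as an illustrative stand-in. The sketch assumes that each missing value equals its arm's observed mean shifted by an amount delta, and then recomputes the treatment difference over a grid of deltas. The function name, the toy data, and the grid are hypothetical.

```python
import itertools
import numpy as np

def delta_adjusted_difference(y_trt, y_ctl, delta_trt, delta_ctl):
    """Treatment-minus-control mean QoL when every missing value (NaN) is
    assumed to equal its arm's observed mean shifted by that arm's delta."""
    def arm_mean(y, delta):
        y = np.asarray(y, dtype=float)
        missing = np.isnan(y)
        obs_mean = y[~missing].mean()
        frac_missing = missing.mean()
        # Mix the observed mean with the shifted mean assumed for dropouts.
        return (1 - frac_missing) * obs_mean + frac_missing * (obs_mean + delta)
    return arm_mean(y_trt, delta_trt) - arm_mean(y_ctl, delta_ctl)

nan = float("nan")
# Toy data: 30% missing in the treatment arm, 50% missing in the control arm.
y_trt = [56, 60, 58, 57, 59, 58, 58, nan, nan, nan]
y_ctl = [50, 54, 52, 51, 53, nan, nan, nan, nan, nan]

# delta = 0 treats dropouts like completers; negative deltas assume they did worse.
deltas = [0, -5, -10]
results = {
    (dt, dc): delta_adjusted_difference(y_trt, y_ctl, dt, dc)
    for dt, dc in itertools.product(deltas, deltas)
}
for (dt, dc), diff in results.items():
    print(f"delta_trt={dt:>4}, delta_ctl={dc:>4}: difference = {diff:.1f}")
print("range:", round(min(results.values()), 1), "to", round(max(results.values()), 1))
```

If the sign or practical size of the difference changes within a plausible range of deltas, the reported treatment effect should be described as sensitive to the missing data assumptions.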

From the time a decision has been made to include QoL outcomes in an advanced-stage disease trial, investigators should involve a statistician familiar with advanced methods for analyzing longitudinal data with nonignorable missing data.17-20 Beyond this, investigators need to become familiar with the broad concepts and objectives of sensitivity analyses to understand how study interpretations might vary. There are three relatively user-friendly analysis platforms, with readily available commercial or standard software, that accommodate nonignorable missingness while encouraging familiarity with the broad underlying concepts: pattern mixture models, multiple imputation, and structural equation models.

Each of the three platforms represents an environment for conducting related analyses rather than a commitment to one specific nonignorable model. Pattern mixture models use standard analyses, such as linear mixed-effects models, to obtain separate estimates for the QoL outcome within strata based on missing data patterns; estimates are then combined in specialized ways4 to yield appropriate overall effect estimates and SEs. The multiple imputation approach, with representative software available from SOLAS (Statistical Solutions, Saugus, MA; www.statsolusa.com) and S-Plus (Insightful Corp, Seattle, WA; www.insightful.com), applies standard methods of analysis to data sets that are varied to represent the effects of missingness. This approach allows great scope for examining a range of assumptions for both the missing data model and the data generation model, and may offer more realistic estimates of true sampling uncertainties than other methods. The structural equation approach, although relatively unfamiliar in clinical trials contexts,21 provides a unified conceptual framework and a modeling interface that allows natural specification of assumptions using easily understood causal diagrams or natural language syntax. The comprehensive modeling program Mplus (Muthén & Muthén, Los Angeles, CA; www.statmodel.com)22 is particularly flexible in that its statistical infrastructure permits specification of MAR models, selection models, and pattern mixture models through minor changes in command syntax. Many advanced estimation approaches that otherwise require esoteric dedicated programming can instead be implemented with little technical demand, allowing greater focus on the concepts and assumptions that distinguish the models.
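The combining steps in the first two platforms can be sketched in a few lines. The helper names and numbers below are hypothetical and not taken from any of the packages mentioned. `pattern_mixture_estimate` shows only the point estimate, a proportion-weighted average of pattern-specific estimates; it omits the SE calculation and takes for granted that each pattern's estimate has already been identified through some assumption. `pool_rubin` applies the standard combining rules for multiple imputation (Rubin's rules), without the degrees-of-freedom adjustment.

```python
import math

def pattern_mixture_estimate(estimates, counts):
    """Weight each missing-data pattern's estimate by that pattern's share
    of the sample (point estimate only; SE calculation omitted)."""
    total = sum(counts)
    return sum(est * cnt / total for est, cnt in zip(estimates, counts))

def pool_rubin(estimates, variances):
    """Combine m completed-data estimates and their squared SEs.
    Returns the pooled estimate and its standard error."""
    m = len(estimates)
    q_bar = sum(estimates) / m                           # pooled estimate
    within = sum(variances) / m                          # average sampling variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between               # total variance
    return q_bar, math.sqrt(total)

# Hypothetical pattern-specific mean QoL for cohorts with 1, 2, and 3 assessments.
pm_mean = pattern_mixture_estimate([42.0, 48.0, 55.0], [30, 30, 40])
print(f"pattern mixture mean: {pm_mean:.1f}")

# Hypothetical treatment-effect estimates from three imputed data sets.
pooled, pooled_se = pool_rubin([4.0, 5.0, 6.0], [1.0, 1.0, 1.0])
print(f"pooled effect: {pooled:.2f} (SE {pooled_se:.2f})")
```

The between-imputation term is what lets multiple imputation reflect the extra uncertainty caused by missing data. Disagreement among the imputed data sets widens the pooled SE.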

In conclusion, investigators should assume advanced-stage disease trials will involve some degree of nonignorable missing data. Generally, trials are designed to produce different survival rates, and different survival rates imply systematically different missing data rates. Anticipating this, standard reporting should provide the reader with full information about questionnaire submission rates, dropout patterns, and reasons for dropout. Older analysis methods that require complete cases (eg, multivariate analysis of variance) are not adequate for advanced-disease trials; at a minimum, analysis methods should retain all available data and remain valid under MAR (eg, mixed-effects models). To help evaluate the appropriateness of an MAR model, we have suggested that the analysis must consider the likely impact of nonignorable missing data by expanding the role of observable covariates to explain as much missingness as possible. A key question is, "Is there a covariate that appears to be strongly associated with missing data patterns, such as the baseline QoL value, and does the analysis efficiently exploit this information?" The discussion section needs a realistic interpretation of the uncertainty caused by missing data. Practices and claims that are questionable even with complete data (eg, overgeneralizing results, accepting the null hypothesis, confusing statistical and practical significance) become wildly inappropriate in the presence of nonignorable missing data. Because it is not possible to know for sure whether missing data are nonignorable, it is always prudent to examine how conclusions change by varying key statistical assumptions.

Authors’ Disclosures of Potential Conflicts of Interest

The authors indicated no potential conflicts of interest.

REFERENCES

1. Skeel RT: Quality of life dimensions that are most important to cancer patients. Oncology 7:55-61, 1993

2. Bernhard J, Cella DP, Coates AS, et al: Missing quality of life data in cancer clinical trials: Serious problems and challenges. Stat Med 17:517-532, 1998

3. Moinpour CM, Triplett JS, McKnight B, et al: Challenges posed by nonrandom missing quality of life data in an advanced-stage colorectal cancer clinical trial. Psychooncology 9:340-354, 2000

4. Fairclough D: Design and Analysis of Quality of Life Studies in Clinical Trials. Boca Raton, FL, Chapman & Hall/CRC, 2002

5. Fayers PM, Machin D: Quality of Life: Assessment, Analysis and Interpretation. Chichester, United Kingdom, John Wiley & Sons Ltd, 2002

6. Pauler DK, McCoy S, Moinpour C: Pattern mixture models for longitudinal quality of life studies in advanced stage disease. Stat Med 22:795-809, 2003

7. Campbell D, Stanley J: Experimental and Quasi-Experimental Designs for Research. Chicago, IL, Rand-McNally, 1963

8. Brown J, Thorpe H, Napp V, et al: Assessment of quality of life in the supportive care setting of the Big Lung Trial in non–small-cell lung cancer. J Clin Oncol 23:7417-7427, 2005

9. Troxel AB, Fairclough DL, Curran D, et al: Statistical analysis of quality of life with missing data in cancer clinical trials. Stat Med 17:653-666, 1998

10. Troxel A, Moinpour CM: Design and analysis of quality of life data, in Crowley J, Ankerst DP (eds): Handbook of Statistics in Clinical Oncology: Second Edition, Revised and Expanded. New York, NY, Marcel Dekker (in press)

11. Schafer JL: Analysis of Incomplete Multivariate Data. Boca Raton, FL, Chapman & Hall/CRC, 1997

12. Fairclough DL, Peterson HF, Chang V: Why are missing quality of life data a problem in clinical trials of cancer therapy? Stat Med 17:667-677, 1998

13. Schafer JL, Graham JW: Missing data: Our view of the state of the art. Psychol Methods 7:147-177, 2002

14. Hopwood P, Stephens RJ, Machin D: Approaches to the analysis of quality of life data: Experiences gained from a Medical Research Council Lung Cancer Working Party palliative chemotherapy trial. Qual Life Res 3:339-352, 1994

15. Curran D, Molenberghs G, Fayers PM, et al: Incomplete quality of life data in randomized trials: Missing forms. Stat Med 17:697-709, 1998

16. Troxel AB, Ma GM, Heitjan DF: An index of local sensitivity to nonignorability. Statistica Sinica 14:1221-1237, 2004

17. Rubin DB: Multiple Imputation for Nonresponse in Surveys. New York, NY, John Wiley & Sons, 1987

18. Diggle PJ, Kenward MG: Informative dropout in longitudinal data analysis (with discussion). Appl Stat 43:49-93, 1994

19. Little RJA: Modeling the dropout mechanism in repeated-measures studies. J Am Statist Assoc 90:1112-1121, 1995

20. Hogan JW, Laird NM: Model-based approaches to analyzing incomplete longitudinal and failure time data. Stat Med 16:259-272, 1997

21. Donaldson GW: General linear contrasts on latent variable means: Structural equation hypothesis tests for multivariate clinical trials. Stat Med 22:2893-2917, 2003

22. Muthén LK, Muthén BO: Mplus User’s Guide (ed 3). Los Angeles, CA, Muthén & Muthén, 2004







Copyright © 2005 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X