Originally published as JCO Early Release 10.1200/JCO.2006.06.7942 on July 5 2006 © 2006 American Society of Clinical Oncology.
Noise and Bias in Microarray Analysis of Tumor Specimens
Departments of Medicine and Molecular Genetics and Microbiology, Division of Medical Oncology, Duke Institute for Genome Sciences and Policy, Duke University Medical Center, Durham, NC
Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School; Department of Medicine, Brigham and Women's Hospital, Boston, MA
Microarray-based investigations can provide fundamental insights into cancer biology1 and have the potential to predict patient outcome2,3 and response to therapy.4 For microarray analysis to be informative, measured gene expression must accurately reflect tumor biology. Given the increasing use of microarray analysis in clinical oncology, it is imperative that investigators understand how to minimize expression noise and bias through effective trial design.
Expression noise is inherent to microarray-based measurement of gene expression. Noise can be defined as gene expression variation that does not correlate with the biology or behavior being studied; it is introduced both by unrelated biologic phenomena and during tissue processing for analysis. Expression noise can obscure informative patterns of gene expression (resulting in a false-negative finding, a beta error), and most computational approaches to microarray analysis are focused on finding true associations despite profound noise.
Bias is not inherent to microarray-based analysis but is easily introduced by faulty experimental design. For this discussion, expression bias can be defined as expression variation that correlates with the hypothesis being tested but is caused by experimental factors that are independent of the hypothesis. Bias confounds analysis and begets false associations (false-positive associations, alpha errors). Bias is not accounted for by most analytic methodologies, and investigators therefore need to minimize the potential for bias by implementing appropriate experimental design.
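The distinction between noise and bias can be illustrated with a small simulation. The Python sketch below is not drawn from any of the studies discussed here; the function name, its parameters, and all numeric values are illustrative assumptions. It simulates one gene measured in two sample groups and reports the observed difference in group means when a processing artifact (bias) is or is not applied to one group only.

```python
import random
import statistics


def mean_difference(n_per_group, true_effect, noise_sd, bias_shift, seed=0):
    """Simulate one gene's log-expression in two groups and return the
    observed difference in group means (group B minus group A).

    true_effect: real biological difference between the groups
    noise_sd:    spread of variation unrelated to the hypothesis (noise)
    bias_shift:  processing artifact applied only to group B (bias)
    """
    rng = random.Random(seed)
    group_a = [rng.gauss(0.0, noise_sd) for _ in range(n_per_group)]
    group_b = [rng.gauss(true_effect, noise_sd) + bias_shift
               for _ in range(n_per_group)]
    return statistics.mean(group_b) - statistics.mean(group_a)


# Hypothetical gene with no true difference between the groups.
noise_only = mean_difference(5000, true_effect=0.0, noise_sd=1.0, bias_shift=0.0)
with_bias = mean_difference(5000, true_effect=0.0, noise_sd=1.0, bias_shift=1.0)
print(f"noise only: {noise_only:+.2f}")
print(f"with bias:  {with_bias:+.2f}")
```

Under these assumptions, the noise-only estimate settles near zero as the sample size grows, whereas the biased estimate settles near the size of the artifact. Larger sample sizes average out noise but do nothing to remove bias, which is why bias must be addressed by design rather than by analysis.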
In this issue of the Journal of Clinical Oncology, Lin et al5 detail gene expression changes in prostate tissue that occur during tissue processing by comparing prostate biopsies performed in situ during surgery with those performed ex vivo during gross pathologic assessment. The authors performed microarray analysis on paired samples from 12 patients and determined that 1.5% of the genes studied (n = 62) have increased expression in the ex vivo biopsies. For this analysis, the authors used a false-discovery rate (FDR) threshold of 0.10 as an arbitrary statistical cutoff, under which an estimated 10% of the genes identified are falsely associated with ischemia. No genes had significantly decreased expression using the same statistical cutoff. Quantitative polymerase chain reaction confirmed a representative subset of the genes identified. Pathway analysis using GenMAPP (http://www.genmapp.org/) determined that the observed gene expression was similar to that seen with stress responses caused by cytotoxic drugs, heat shock, and radiation.
The trial design of Lin et al demonstrates how successful microarray investigations minimize noise and avoid bias through thoughtful experimental design and meticulous execution. Their detailed methodology provides a solid roadmap for successful microarray analysis, and several aspects of their methods warrant specific comment. First, the authors minimized potential bias by taking biopsies from both the in situ and ex vivo prostates rather than comparing the in situ biopsies to a section of tissue collected from the prostate during gross dissection. Second, the authors minimized noise by performing laser-capture microdissection (LCM). LCM allows the selective analysis of a specific cell population (eg, epithelial cells) and will minimize noise introduced by variation in tissue composition.6 Although LCM may not be appropriate for all tumor-based expression studies, its use by Lin et al minimized the effect of tissue heterogeneity on subsequent analysis. Finally, the use of FDR7 sufficiently accounts for multiple hypothesis testing and minimizes the chance of false positives caused solely by expression noise; FDR is becoming an accepted standard in microarray analysis.
The genes with differential expression between in situ and ex vivo biopsies are characterized by the authors as being associated with cellular stress. This is supported by the presence of DUSP1 and JUN, which have previously been shown to have differential expression in response to cellular insults.8,9 Their ischemia-related genes also partially overlap with a previous set of genes induced during warm ischemia (for example, JUNB and JUND).10 These genes likely contribute to noise in most tumor-based microarray studies because, even in the best of settings, the time between tissue devascularization during surgery and specimen freezing can vary significantly. Thus, this work further underscores the importance of standardized sample handling to minimize noise. However, the association between the genes induced by hypoxia and cellular stress also suggests that these changes can potentially act as confounding variables and introduce bias.
As a quick survey to determine whether the 62 ischemic genes identified by Lin et al can introduce bias, we used gene set enrichment analysis (GSEA)11 to assess coordinate differential expression across sample classes from some of our previously reported prostate cancer analyses. As Table 1 demonstrates, the ischemic genes show no significant differential expression in comparisons of benign with malignant localized prostate specimens,12 recurrent with nonrecurrent localized prostate cancer,12 or neoadjuvant docetaxel-treated with untreated localized prostate cancer.13 This suggests that the ischemic gene set may have introduced noise, but not bias, in these studies. However, in a study comparing expression between biopsies from metastatic androgen-independent prostate cancer and samples from localized prostate cancer,14 there was strong differential expression in the radical prostatectomy specimens (FDR enrichment, P < .001). Although we focused on differential expression of androgen metabolism genes, and there is no evidence to suggest that our conclusions are affected by the ischemic genes, this asymmetric expression of the ischemic genes demonstrates the potential for bias.
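The idea behind testing a gene set for coordinate differential expression can be sketched in a few lines. The Python function below computes an unweighted running-sum enrichment score, the simplest (Kolmogorov-Smirnov-like) form of the statistic. It is not the GSEA software used for Table 1, which additionally weights genes by their correlation with the phenotype and assesses significance by permutation. The ranked list and the placeholder names GENE_A through GENE_E are hypothetical; DUSP1, JUN, and JUNB are borrowed from the text only as example set members.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a gene set.

    ranked_genes: unique gene names ordered from most up-regulated to most
                  down-regulated in one sample class versus the other
    gene_set:     collection of gene names to test for enrichment
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        # Step up at each set member, step down at every other gene.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        # Keep the largest deviation from zero seen so far.
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking in which the set members cluster near the top.
ranked = ["DUSP1", "JUN", "GENE_A", "JUNB", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
ischemia_set = {"DUSP1", "JUN", "JUNB"}
es_top = enrichment_score(ranked, ischemia_set)
es_bottom = enrichment_score(list(reversed(ranked)), ischemia_set)
print(round(es_top, 3))     # prints 0.8
print(round(es_bottom, 3))  # prints -0.8
```

A score far from zero in either direction indicates that the set members cluster at one end of the ranking, which is the asymmetric pattern that signals potential bias; a score near zero is consistent with the set contributing only noise.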
Looking forward, the findings by Lin et al underscore the importance of valid study design and rigorous implementation for clinical trials incorporating microarray analysis. Without appropriate controls, comparisons between pretreatment biopsies and post-treatment surgical specimens are vulnerable to bias. This will be especially true when there is likely to be significant overlap between the ischemic response genes and genes whose expression may change with the tumor biology of interest (such as hypoxia-inducible factor activity) or specific therapies (such as antiangiogenic therapy). Although evolving statistical measures of expression (such as the FDR) can help account for noise and multiple testing, they cannot overcome inherent bias. Other procedural elements that have been shown to contribute to poor comparability and bias include batch-to-batch processing differences15 and dye choice for spotted arrays,16 although there are undoubtedly many more. Minimizing processing differences that parallel the clinical or biologic question being addressed is critical to ensure that potentially confounding variables do not introduce bias into microarray experiments.
The article by Lin et al details the expression manifestations of an omnipresent technical factor in gene expression analysis of human tissue: tissue processing time. By providing a detailed description of the genes induced by tissue ischemia, Lin et al provide an expression signature of ischemia that may prove helpful for quality control in future microarray experiments. A quick survey of published work demonstrates that these genes have the potential to contribute noise and bias, depending on study context, underscoring the importance of thoughtful microarray trial design in clinical oncology.
Authors' Disclosures of Potential Conflicts of Interest
The authors indicated no potential conflicts of interest.
ACKNOWLEDGMENTS
Phillip G. Febbo is a Damon Runyon Cancer Research Foundation clinical investigator.
NOTES
Published online ahead of print at www.jco.org on July 5, 2006.
REFERENCES
1. Sweet-Cordero A, Mukherjee S, Subramanian A, et al: An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat Genet 37:48-55, 2005
2. van de Vijver MJ, He YD, van't Veer LJ, et al: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347:1999-2009, 2002
3. Bild AH, Yao G, Chang JT, et al: Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 439:353-357, 2006
4. Rosenwald A, Wright G, Chan WC, et al: The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 346:1937-1947, 2002
5. Lin DW, Coleman IM, Hawley S, et al: Influence of surgical manipulation on prostate gene expression: Implications for molecular correlates of treatment effects and disease prognosis. J Clin Oncol 24:3763-3770, 2006
6. Emmert-Buck MR, Bonner RF, Smith PD, et al: Laser capture microdissection. Science 274:998-1001, 1996
7. Benjamini Y, Hochberg Y: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J Royal Stat Soc: Series B 57:289-300, 1995
8. Wild PJ, Krieg RC, Seidl J, et al: RNA expression profiling of normal and tumor cells following photodynamic therapy with 5-aminolevulinic acid-induced protoporphyrin IX in vitro. Mol Cancer Ther 4:516-528, 2005
9. Ma SF, Grigoryev DN, Taylor AD, et al: Bioinformatic identification of novel early stress response genes in rodent models of lung injury. Am J Physiol Lung Cell Mol Physiol 289:L468-L477, 2005
10. Dash A, Maine IP, Varambally S, et al: Changes in differential gene expression because of warm ischemia time of radical prostatectomy specimens. Am J Pathol 161:1743-1748, 2002
11. Subramanian A, Tamayo P, Mootha VK, et al: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550, 2005
12. Singh D, Febbo PG, Ross K, et al: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1:203-209, 2002
13. Febbo PG, Richie JP, George DJ, et al: Neoadjuvant docetaxel before radical prostatectomy in patients with high-risk localized prostate cancer. Clin Cancer Res 11:5233-5240, 2005
14. Stanbrough M, Bubley GJ, Ross K, et al: Increased expression of genes converting adrenal androgens to testosterone in androgen-independent prostate cancer. Cancer Res 66:2815-2825, 2006
15. Piper MD, Daran-Lapujade P, Bro C, et al: Reproducibility of oligonucleotide microarray transcriptome analyses: An interlaboratory comparison using chemostat cultures of Saccharomyces cerevisiae. J Biol Chem 277:37001-37008, 2002
16. Dobbin KK, Kawasaki ES, Petersen DW, et al: Characterizing dye bias in microarray experiments. Bioinformatics 21:2430-2437, 2005
Copyright © 2006 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X