Journal of Clinical Oncology, Vol 24, No 18 (June 20), 2006: pp. 2808-2814
© 2006 American Society of Clinical Oncology. DOI: 10.1200/JCO.2005.04.3661
The "Big Dog" Effect: Variability Assessing the Causes of Error in Diagnoses of Patients With Lung Cancer
From the Department of Pathology, University of Pittsburgh School of Medicine; Western Pennsylvania Hospital; Department of Family Medicine and Clinical Epidemiology, University of Pittsburgh School of Medicine, Pittsburgh, PA; Henry Ford Health System, Detroit, MI; University of Iowa Healthcare, Iowa City, IA; Wake Forest University Medical Center, Winston-Salem, NC; and the Loyola University Medical Center, Maywood, IL
Address reprint requests to Stephen S. Raab, MD, Department of Pathology, 5230 Centre Avenue, UPMC Shadyside Hospital, Pittsburgh, PA 15213; e-mail: raabss@upmc.edu
PURPOSE: The frequency of diagnostic error in patients who have a lung mass and a pathology specimen is as high as 15%. This study examined the role of inter-pathologist agreement in identifying the cause of error in these patients.
METHODS: Pathologists from six institutions reviewed the slides of 40 patients who had a pulmonary specimen false-negative diagnosis. The initial assessment of error cause arose from cytologic-histologic correlation slide review of discrepant diagnostic samples in patients who had both a bronchial brushing cytologic and surgical specimen. The cause of error was attributed either to clinical sampling (diagnostic material obtained in one but not the other sample) or interpretation (pathologist failed to identify the salient diagnostic features). The pairwise kappa (
RESULTS: The pairwise
CONCLUSION: Pathologists exhibit poor agreement in determining the cause of error for pulmonary specimens sent for cancer diagnosis. We developed a psychosocial hypothesis (the "Big Dog" Effect) that partially explains biases in error assessment. This lack of agreement precludes confident targeting of these errors for quality improvement interventions with prospects of success across a variety of institutions.
Cancer diagnoses are usually based on the cytologic or histologic evaluation of tissues.1 The reproducibility of pathologist error assessments in cancer diagnosis from one institution to another has not been systematically studied. However, this reproducibility has implications for the evaluation, selection, and monitoring of error reduction in cancer diagnosis and for the medical legal assessment of failure or delay in diagnosis as a form of malpractice. The frequency and type of errors in cancer diagnosis depend on the method of detection.1 A commonly used method is cytologic-histologic (CH) correlation,2,3 in which a pathologist reviews the diagnoses of cytologic and surgical specimens from the same anatomic site (eg, fine needle aspiration and surgical excision of the thyroid gland). In this exercise, the correlated diagnoses are classified as either concordant or discordant. Raab et al reported that up to 11.8% of all nongynecologic CH correlation pairs were discordant.1 The first step of a root cause analysis, as performed by most laboratories, involves the review of the slides from the discordant cases to determine whether the source of error is sampling (ie, diagnostic material obtained in one but not the other sample) or interpretation (diagnostic material is present in both samples but misinterpreted in one sample).2,4 For 15 years, CH correlation has been mandated by federal regulation on the assumption that the detection and investigation of discrepancies improves diagnostic accuracy and clinical care.5
In a single institutional study of diagnostic cancer errors detected by the CH correlation method, Clary et al reported that on the original review, 66% of nongynecologic errors were secondary to sampling and 34% of errors were secondary to interpretation.2 The interobserver variability of the rereview assignment of error cause was calculated using the pairwise kappa (
This report summarizes the first multi-institutional study examining the inter- and intraobserver agreement in the attribution of the causes of errors in lung cancer diagnosis, as they were detected by CH correlation. The six institutions that participated in this project are members of a consortium studying patient safety under an Agency for Healthcare Research and Quality (AHRQ)-funded quality improvement initiative.6
Background and Design
In 2002, AHRQ provided funding to four institutions to determine baseline error frequencies detected by different methods; determine the clinical impact of diagnostic errors; perform root cause analysis to derive error reduction strategies; and assess the success of these error reduction strategies.6 Subsequently, two additional institutions joined the consortium project. Institutional review board approval for performance of this project was obtained at all sites.
Description of the CH Correlation Review Process
Definition of CH Discrepancy and Cause
We defined a discrepancy as a difference between the cytologic and histologic diagnoses that indicated presence or absence of a pathological entity or a definite difference in the degree to which the pathologic condition is judged to be present.1 Cytologic and surgical diagnostic schema are different, and we developed a scaled hierarchy of categories to determine whether a discrepancy occurred (Fig 1).6 To determine differences, we classified all diagnoses (both cytologic and surgical) into categoric steps of unsatisfactory, benign, atypical, suspicious, and malignant. We defined a CH correlation pair as discrepant if there was at least a two-step difference in diagnosis. Discrepancies of less than two steps prove to be unreproducible and without clinical implication. In the example shown in Fig 1, the designated review pathologist identified the case as discrepant based on the step system at the bottom of the figure ([surgical pathology malignant diagnosis = step 5] − [cytology benign diagnosis = step 1] = 4).
In our standardized CH process, the review pathologist examined all microscopic slides and determined if the cytologic diagnosis, surgical pathology diagnosis, neither diagnosis, or both diagnoses were the cause of discrepancy.6 The review pathologist then assigned an underlying root cause for the error using two categories: interpretation or sampling. An interpretation error was an error in categorization. Interpretive errors were further classified as overcalls if the review diagnosis was more than two categoric steps lower than the original or undercalls if the review diagnosis was more than two categoric steps higher than the original. A sampling error was an error in which diagnostic material was not present on the discrepant slide, even on review.
In this study, we examined review by a single pathologist because such review is the most common way that American laboratories perform their regulatory duty of CH correlation.7
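The two-step rule described above can be illustrated with a minimal Python sketch. The function names and the default threshold parameter are hypothetical and chosen only for illustration. Only the step values for the worked example (malignant = step 5, benign = step 1) come from the text; the step numbers of the remaining categories are defined in Fig 1 and are not assumed here.

```python
def step_difference(surgical_step: int, cytology_step: int) -> int:
    """Signed difference in categoric steps (surgical minus cytologic)."""
    return surgical_step - cytology_step


def is_discrepant(surgical_step: int, cytology_step: int, min_steps: int = 2) -> bool:
    """A CH pair is discrepant if the diagnoses differ by at least `min_steps`."""
    return abs(step_difference(surgical_step, cytology_step)) >= min_steps


# Worked example from the text: surgical malignant (step 5) vs cytology benign (step 1).
example_difference = step_difference(5, 1)   # 4 steps
example_discrepant = is_discrepant(5, 1)     # True, because 4 >= 2
```

A one-step difference, such as `is_discrepant(2, 1)`, would not be counted as a discrepancy under this rule.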
Case Selection
The four original project sites each selected 10 discrepant cases for review. These institutions reported varied proportional assignments of error cause (Table 2). The institutions selected specimen pairs that represented typical CH correlation cases. Originally, all institutions had attributed errors predominantly to sampling and predominantly on cytology samples and did not select false-positive interpretive errors. At the time of the original error assignment, institution A used a cytology fellow to perform CH correlation. A senior pathologist reviewed the cases for institution A for this study. The original review pathologist for institution D had left at the time of this study, and a new pathologist performed the reviews. For review purposes, we outlined rules for case selection. The slide sets were chosen so that one slide set would not contain significantly different types of cases from another slide set. Our rules specifically asked for an equal mix of interpretation and sampling errors on cytology cases.
Data Collection
The discrepant specimen slides were sent to the coordinating project site and were de-identified so that the reviewers could not ascertain the site of origin. The original reports were also de-identified (institution, patient, pathologist, cytotechnologist, and clinician names removed). The original diagnoses, clinical history, gross and microscopic findings, and diagnostic comments were not removed since this information is routinely available to pathologists performing CH correlation. The coordinating project site first performed the CH correlation. A senior cytotechnologist screened each cytology slide and dotted the areas containing the most worrisome findings, and the coordinating site pathologist reviewed the slides and made an assignment of error cause. All 40 slides then were sent to the other institutions for review. Each pathologist performed CH correlation in the fashion in which it would be performed at his or her institution. Each site completed a standardized data collection form that recorded the review cytologic and surgical diagnoses, an assessment if the discrepancy was due to either sampling or interpretation (or both), and an assessment if the error occurred in association with a cytological or surgical specimen.
Data Analysis
The pairwise
We interpreted the pairwise
Table 4 presents the pairwise κ statistic for the site-specific review reason compared with the original reason for discrepancy. Overall, the pairwise κ statistics were variable; however, the 95% CIs were relatively narrow, allowing us to interpret the calculated κ statistics as reasonable representations of pathologist agreement beyond chance. The intersite pairwise κ statistic varied from 0.200 to 0.783. A negative κ statistic indicated that the level of agreement was worse than that expected by chance alone (κ = 0). Project site D originally classified every discrepancy as sampling but none of the other project sites (including project site D) concurred that every error was secondary to sampling on review. The majority (73%) of intersite pairwise κ statistic values were 0.400 or less, indicating poor agreement.
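To make the notion of agreement beyond chance concrete, the following is a minimal sketch of a pairwise kappa calculation, assuming the unweighted Cohen's kappa for two raters; the source does not state which variant or software was used. The function name and all case labels below are hypothetical (S = sampling, I = interpretation) and are not data from this study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is perfect
    return (observed - expected) / (1.0 - expected)


# Hypothetical assignments of error cause for 10 cases by two reviewers.
reviewer_1 = list("SSSSSSIIII")
reviewer_2 = list("SSSSIIIIIS")
kappa_pair = cohen_kappa(reviewer_1, reviewer_2)      # observed 0.7, expected 0.5

# A rater who assigns every case to sampling, compared with reviewer_2.
all_sampling = ["S"] * 10
kappa_constant = cohen_kappa(all_sampling, reviewer_2)
```

With these hypothetical labels the first pair yields a kappa of about 0.4. The second comparison yields exactly 0: under this formula, when one rater places every case in a single category, observed agreement equals chance agreement, so the statistic cannot exceed 0 however often the two raters coincide.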
The intrasite pairwise κ statistic varied from 0.154 to 0.800. The site B pathologist was the same for the original and review CH correlation (good agreement) and the site C pathologist reviewed only a fraction of the original cases (poor agreement). The site D and A pathologists were not involved in the original CH correlation. The site D pathologist replaced the first pathologist and the site A pathologist had assigned the CH correlation to trainees (even worse agreement).
Table 5 shows the pairwise
These data show that the causes of error assigned to CH discrepancies are highly variable between review pathologists at different institutions. As a consequence, the first classifying step in the root cause analysis for this type of mandated correlation is highly biased. This bias prevents institutions from comparing error causes and prevents the validation of inter-institutional error reduction strategies. If pathologists cannot agree on the underlying cause of CH discrepancies, then further addressing sampling or interpretation issues is problematic. We hypothesize that the lack of intersite agreement is secondary to fundamental differences in how pathologists evaluate instances of diagnostic discrepancy. We term this phenomenon the "Big Dog" effect: senior, experienced pathologists at each institution serve as the final arbiter of error cause and use different methods and approaches to decide whether discrepancies exist and what caused them. When Big Dogs are confronted with differing assessments from other Big Dogs, they remain reluctant to change their opinions. As part of this project, the Big Dogs conjointly evaluated errors on several occasions. In every case, the locally dominant pathologists did not change their assessments to a significant extent. Pathologists tend to operate in local environments with little exposure to outside opinion; this has led to poor inter-institutional diagnostic agreement for particular case types. The local nature of CH correlation reference standards is a potential source of error that rarely is addressed.
A corollary to the Big Dog effect is the "Little Dog" effect, which operates at the local institutional level. Most institutions have only one Big Dog to whom other pathologists (Little Dogs) at the same institution defer diagnostic judgment (bite). Institution B exhibited the highest intrasite
In our study, the pairwise
Although the CH correlation process has been mandated by law on the assumption that it detects the causes of errors, some institutions use it in ways that systematically bias assignment of error to the preanalytic phase of testing (sampling).1 For example, cytology trainees are biased not to attribute interpretive error to their mentors. Similar to nonpathology error detection methods, the CH correlation process is driven by fear of disclosure and fear of blame.10-12 Consequently, laboratories generally avoid using CH correlation data for error reduction.13,14 Most laboratories expend considerable resources performing CH correlation purely because they are required to do so by law and not because they have reasonable hopes of learning to prevent errors based on the findings of this exercise.
Noncorrelating CH correlation cases tend to be challenging.2 Nodit et al15 reviewed 32 pulmonary CH correlation pairs, in which the cytology case was a bronchial washing or brushing; root cause analysis showed that an interpretive error was the originally assigned cause in 50% of cases. However, the diagnosis of malignancy was straightforward on rereview in only one case, indicating that less than optimal sampling contributed to even those errors that were considered interpretive.13 Most pulmonary errors are caused by the interplay of both poor sampling and misinterpretation. This factor may have contributed to the lack of agreement in this study; some pathologists were more likely than others to attribute error to interpretation even if they saw only a few malignant cells.
Although the
Conclusion
Although all authors completed the disclosure declaration, the following authors or their immediate family members indicated a financial interest. No conflict exists for drugs or devices used in a study if they are not being evaluated as part of the investigation. For a detailed description of the disclosure categories, or for more information about ASCO's conflict of interest policy, please refer to the Author Disclosure Declaration and the Disclosures of Potential Conflicts of Interest section in Information for Contributors.
Supported by Agency for Healthcare Research and Quality Grant No. R01 HS13321-01. Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
1. Raab SS, Grzybicki DM, Janosky JE, et al: Clinical impact and frequency of anatomic pathology errors in cancer diagnosis. Cancer 104:2205-2213, 2005
2. Clary JM, Silverman JF, Liu Y, et al: Cytohistologic discrepancies: A mean to improve pathology practice and patient outcomes. Am J Clin Pathol 117:567-573, 2002
3. Jones BA, Novis DA: Cervical biopsy-cytology correlation: A College of American Pathologists Q-Probes study of 22,439 correlations in 348 laboratories. Arch Pathol Lab Med 120:523-531, 1996
4. Joste NE, Crum CP, Cibas ES: Cytologic/histologic correlation for quality control in cervicovaginal cytology: Experience with 1,582 paired cases. Am J Clin Pathol 103:32-34, 1995
5. Department of Health and Human Services, Health Care Financing Administration: Clinical laboratory improvement amendments of 1988: final rule, 57 Federal Register 7146. 1992, codified at 42 CFR
6. Raab SS: Improving patient safety by examining pathology errors. Clin Lab Med 24:849-863, 2004
7. Vrbin CM, Grzybicki DM, Zaleski MS, et al: Variability in cytologic-histologic correlation practiced and implications on patient safety. Arch Pathol Lab Med 129:893-898, 2005
8. Cicchetti DV: Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess 6:284-290, 1994
9. Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 33:159-174, 1977
10. Leape LL: A systems analysis approach to medical error. J Eval Clin Pract 3:213-222, 1997
11. Leape LL, Berwick DM: Five years after to err is human: What have we learned? JAMA 293:2384-2390, 2005
12. Waring JJ: Beyond blame: Cultural barriers to medical incident reporting. Soc Sci Med 60:1927-1935, 2005
13. Wakefield BJ, Blegen MA, Uden-Holman T, et al: Organizational culture, continuous quality improvement, and medication administration error reporting. Am J Med Qual 16:128-134, 2001
14. Eccles M, Grimshaw J, Campbell M, et al: Research designs for studies evaluating the effectiveness of change and improvement strategies. Qual Saf Health Care 12:47-52, 2003
15. Nodit L, Balassanian R, Sudilovsky D, et al: Improving the quality of cytology diagnosis: Root cause analysis for errors in bronchial washing and brushing specimens. Am J Clin Pathol 124:883-893, 2005
16. Byrt T, Bishop J, Carlin JB: Bias, prevalence and kappa. J Clin Epidemiol 46:423-429, 1993
17. Lantz CA, Nebenzahl E: Behavior and interpretation of the κ statistic: Resolution of the two paradoxes. J Clin Epidemiol 49:431-434, 1996
18. Nelson JC, Pepe MS: Statistical description of interrater variability in ordinal ratings. Stat Methods Med Res 9:475-496, 2000
19. Llewellyn H: Observer variation, dysplasia grading, and HPV typing: A review. Am J Clin Pathol 114:S21-35, 2000
20. Leslie KO, Fechner RE, Kempson RL: Second opinions in surgical pathology. Am J Clin Pathol 106:S58-S64, 1996
21. Page DL, Dupont WD, Jensen RA, et al: When and to what end do pathologists agree? J Natl Cancer Inst 90:88-89, 1998
22. Scott MA, Lagios MD, Axelsson K, et al: Ductal carcinoma in situ of the breast: Reproducibility of histological subtype analysis. Hum Pathol 28:967-973, 1997
23. Wells WA, Carney PA, Eliassen MS, et al: Pathologists' agreement with experts and reproducibility of breast ductal carcinoma-in-situ classification schemes. Am J Surg Pathol 24:651-659, 2000
24. Bethwaite P, Smith N, Delahung B, et al: Reproducibility of new classification schemes for the pathology of ductal carcinoma in situ of the breast. J Clin Pathol 51:450-454, 1998
25. Sloane JP, Amendoeira I, Apostolikas N, et al: Consistency achieved by 23 European pathologists in categorizing ductal carcinoma in situ of the breast using five classifications: European Commission Working Group on Breast Screening Pathology. Hum Pathol 29:1056-1062, 1998
Submitted September 25, 2005; accepted March 21, 2006.
Copyright © 2006 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X