Repeated, Repeated, Got Lucky

James O. Westgard

July 2001
An updated version of this essay appears in the Nothing but the Truth about Quality manual

James O. Westgard, PhD, FACB

The title of this essay comes from a story about the review of laboratory quality control records. The story was told by a participant at our recent workshop on Six Sigma Quality Management. In auditing the QC records of a laboratory, the response to an out-of-control situation was documented as "Repeated, repeated, got lucky!"

This phrase captures a QC practice that is common in many laboratories - repeat the control until it finally falls within the limits. In the instance cited, this particular analyst was honest enough to acknowledge that nothing had changed with the quality of the testing process except that, luckily, the control finally fell within acceptable limits. No one knows what was going on with the patient samples.

What's wrong?

It seems silly to run controls to check the quality of laboratory tests, then almost automatically disregard the results when they indicate something might be wrong! Yet that is exactly what is being done in many laboratories today.

Here's a common protocol that describes how laboratories respond to control results:

If out of luck after four repeats, then something needs to be done to fix the testing process. Repeat, repeat, repeat, repeat - no luck, fix problem! As you might imagine, some laboratories have more sophisticated protocols and come up with an even higher number of repeats!

Don't use 2 SD control limits! The real culprit that has compromised laboratory QC is the use of 2 SD control limits, which give a high level of false rejections - 5% for 1 measurement on 1 material (N=1), 9% for 1 measurement on each of 2 materials (N=2), 14% for 1 measurement on each of 3 materials (N=3). Experience with false rejections or false alarms compromises the detection system. Think about a fire alarm and the effect of hearing false alarms. After one or two false alarms, no one pays any attention to the alarm system.
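The growth of false rejections with N can be checked with a short calculation. The sketch below assumes that control results are independent and normally distributed when the method is in control; the function name and parameters are illustrative and not part of the essay. The exact normal tail gives about 4.6% per measurement, so the computed figures come out slightly different from the rounded 5%, 9%, and 14% quoted above.

```python
import math


def false_rejection_rate(n_controls, limit_sd=2.0):
    """Chance that at least one of n in-control results falls outside +/- limit_sd.

    Assumes independent, normally distributed control results (an assumption
    made for this sketch).
    """
    # Probability that a single in-control result falls inside the limits.
    p_inside = math.erf(limit_sd / math.sqrt(2.0))
    # A run is rejected if any one of the n results falls outside.
    return 1.0 - p_inside ** n_controls


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"N={n}: 2 SD limits {false_rejection_rate(n, 2.0):.1%}, "
              f"3 SD limits {false_rejection_rate(n, 3.0):.1%}")
```

Running the same calculation with 3 SD limits shows why wider statistical limits cut false alarms so sharply; what it does not show is how much error detection is lost, which is the point of the sections that follow.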

Don't calculate control limits from labeled means and SDs. One way to get rid of these false alarms is to make the control limits wider. Laboratories have come up with all kinds of rationalizations to do this. One common practice is to use the labeled or bottle values provided by manufacturers. Those bottle values represent the performance observed for several methods or instruments in several different laboratories. They include between-laboratory differences that make the SDs larger than what would be observed in a single laboratory. Larger SDs mean larger control limits and fewer false rejections, but also less error detection. How much less? Who knows?

Don't calculate control limits from peer group means and SDs. Another way to get wider control limits is to use the peer means and SDs obtained for a group of laboratories that utilize the same lot of control material. These peer assessment services are often available from manufacturers and may include hundreds of different laboratories. The group mean is often useful for assessing the systematic difference between an individual laboratory and the peer group. However, control limits should not be calculated from the group mean and SD because the SD will include variations between laboratories. Again, larger SDs, larger control limits, fewer false rejections, less error detection, and no idea whether medically important problems will actually be detected.

Don't use "medical decision limits." Perhaps the most insidious approach has been the use of "medical decision limits," which have been defined as a "second set of limits set for control values… meant to be a wider set of limits indicating the range of medically acceptable results." [1] The idea is to have two sets of control limits, the wider set allowing the reporting of patient data even though the narrow statistical QC limits may have been violated. The origin of this concept is difficult to trace, but it received favorable review in a CAP Q-Probe study in the mid 90s. The name sounds so good, but the practice is so bad! Don't, don't, don't do this! [See The Myth of Medical Decision Limits on this website for a more detailed discussion.]

What's right?

The most recent consensus guideline for QC practices is NCCLS C24-A2: Statistical Quality Control for Quantitative Measurements: Principles and Definitions. Approved Guideline - Second Edition [2]. This document is a 1999 update of an earlier guideline that was developed in the late 1980s. It provides a number of recommendations on how to improve QC practices.

Do minimize false rejections by careful planning. The problem with 2 SD control limits should be dealt with up front by careful planning and selection of QC procedures. The planning process involves defining the quality required for the test, accounting for the imprecision and inaccuracy observed for the method, then selecting control rules and numbers of control measurements that will minimize false rejections. In general, that eliminates the use of 2 SD control limits for all applications except when N=1, which would limit false rejections to 5%.
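One common way to put numbers on this planning step, which the essay itself does not spell out, is the sigma metric and the related critical systematic error. The sketch below assumes the quality requirement is expressed as an allowable total error in percent, with bias and CV also in percent; the function names and the example values are hypothetical.

```python
def sigma_metric(allowable_total_error_pct, bias_pct, cv_pct):
    """Sigma metric: (allowable total error - |bias|) / CV, all in percent."""
    if cv_pct <= 0:
        raise ValueError("cv_pct must be positive")
    return (allowable_total_error_pct - abs(bias_pct)) / cv_pct


def critical_systematic_error(allowable_total_error_pct, bias_pct, cv_pct):
    """Size of the systematic shift, in multiples of the SD, that QC needs to detect.

    The 1.65 term corresponds to allowing a 5% maximum defect rate.
    """
    return sigma_metric(allowable_total_error_pct, bias_pct, cv_pct) - 1.65


if __name__ == "__main__":
    # Hypothetical test: 10% allowable total error, 1% bias, 2% CV.
    print(sigma_metric(10.0, 1.0, 2.0))
    print(critical_systematic_error(10.0, 1.0, 2.0))
```

A larger value means that medically important errors are large relative to the method's imprecision, which leaves room for wider limits or fewer control measurements; a small value calls for more demanding rules and a higher N.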

Do build in the error detection necessary for individual tests. The error detection that is needed can be determined on the basis of the quality required for each test, the imprecision and inaccuracy observed for the method, and the known rejection characteristics of different control rules and numbers of control measurements. Improved error detection is obtained for single-rule procedures by narrowing control limits from 3.5s to 2.5s and by increasing N. Improved error detection can also be obtained by building multirule procedures that provide parallel testing by two or more rules. For example, the 13s/22s/R4s/41s/8x multirule control procedure adds rules such as 22s, 41s, and 8x to increase the detection of systematic error and adds the R4s rule to increase the detection of random error.
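A minimal sketch of how such a multirule check might be coded is shown below. It assumes control results have already been converted to z-scores, (value - mean) / SD, using the laboratory's own mean and SD, and it writes the rules as 1_3s, 2_2s, R_4s, 4_1s, and 8_x. The function name is hypothetical, and applying R_4s to the two most recent results (assumed to come from the same run) is a simplification made for this example.

```python
def multirule_violations(z_scores):
    """Return the rules violated by a sequence of control z-scores.

    z_scores is ordered oldest first, with the most recent result last.
    """
    z = list(z_scores)
    if not z:
        return []

    def same_side_beyond(values, k):
        # True when every value exceeds +k SD, or every value is below -k SD.
        return all(v > k for v in values) or all(v < -k for v in values)

    violated = []
    # 1_3s: the latest result exceeds the mean +/- 3 SD.
    if abs(z[-1]) > 3:
        violated.append("1_3s")
    # 2_2s: two consecutive results exceed the same 2 SD limit.
    if len(z) >= 2 and same_side_beyond(z[-2:], 2):
        violated.append("2_2s")
    # R_4s: one result exceeds +2 SD and the other exceeds -2 SD.
    if len(z) >= 2 and max(z[-2:]) > 2 and min(z[-2:]) < -2:
        violated.append("R_4s")
    # 4_1s: four consecutive results exceed the same 1 SD limit.
    if len(z) >= 4 and same_side_beyond(z[-4:], 1):
        violated.append("4_1s")
    # 8_x: eight consecutive results fall on the same side of the mean.
    if len(z) >= 8 and same_side_beyond(z[-8:], 0):
        violated.append("8_x")
    return violated


if __name__ == "__main__":
    print(multirule_violations([2.1, 2.3]))   # both beyond +2 SD
    print(multirule_violations([2.4, -2.2]))  # opposite 2 SD limits
```

An empty list means the run is accepted; any entry is a rejection signal that should lead to troubleshooting rather than another repeat.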

Do establish your own means and SDs for calculating control limits. The principle of statistical QC is to compare the observed variation of a method with the variation expected if the method were under stable operation. The best estimate of the expected variation for your laboratory comes from data from your laboratory, i.e., from the means and SDs observed for control materials analyzed in your laboratory. NCCLS permits the use of labeled or assigned values for a short period (up to 4 weeks or so) while control data are being collected in your laboratory. After that, the laboratory should switch over to control limits calculated from its own means and SDs, using cumulative data from 3 to 6 months to obtain reliable estimates of the means and SDs.
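As a rough illustration, control limits can be computed directly from the laboratory's own control results. The sketch below uses Python's standard statistics module; the function name and the five example values are hypothetical, and a real estimate would rest on far more data, cumulated over months as described above.

```python
import statistics


def limits_from_lab_data(results, n_sd=3.0):
    """Return (mean, sd, lower_limit, upper_limit) from a lab's own control results."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)  # sample SD; needs at least two results
    return mean, sd, mean - n_sd * sd, mean + n_sd * sd


if __name__ == "__main__":
    # Hypothetical control results for one material, one lot.
    own_results = [98.0, 102.0, 100.0, 101.0, 99.0]
    mean, sd, low, high = limits_from_lab_data(own_results, n_sd=3.0)
    print(f"mean={mean:.1f} SD={sd:.2f} limits={low:.1f} to {high:.1f}")
```

Substituting a larger bottle or peer-group SD into the same arithmetic widens the limits, which is exactly how those practices trade error detection for fewer alarms.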


Do problem-solve all out-of-control situations. If false rejections are minimized up front and the error detection is appropriate for the test, then an out-of-control signal should lead directly to troubleshooting the process and fixing the problem. That's what QC is supposed to be about: fixing problems and eliminating their causes so they won't occur again. That will lead to a stable, well-controlled process that will seldom have problems.

References

  1. Tetrault GA, Steindel SJ. Q-Probe 94-08. Daily quality control exception practices. Chicago: College of American Pathologists, 1994.
  2. NCCLS C24-A2. Statistical Quality Control for Quantitative Measurements: Principles and Definitions; Approved Guideline - Second Edition. National Committee for Clinical Laboratory Standards, Wayne, PA, 1999.

James O. Westgard, PhD, is a professor of pathology and laboratory medicine at the University of Wisconsin Medical School, Madison. He is also president of Westgard QC, Inc. (Madison, Wis.), which provides tools, technology, and training for laboratory quality management.


Copyright © 2001. All rights reserved.
Westgard QC, 7614 Gray Fox Trail, Madison WI 53717
Call 608-833-4718 or e-mail us at westgard@westgard.com
