In the previous essay, I described the implications of Six Sigma Quality Management for setting analytical specifications for the precision and bias of laboratory testing processes [1]. The desirable precision is 1/6 of the tolerance specification or 1/6 of the allowable total error for a laboratory test. One way of applying this to laboratory tests is to divide the CLIA proficiency testing acceptability criteria by 6 to establish specifications for the precision of laboratory methods [see the table in the previous discussion, 1]. When this precision performance is achieved, the Six Sigma model postulates that a bias up to 1.5 times the standard deviation of the method should be tolerable without causing defective results. Larger biases that would cause significant defects should be readily detectable by simple statistical QC procedures, such as use of 3 SD control limits with a very low number of control measurements.
BUT, what should you do if your method doesn't achieve six-sigma performance? Of course, the first option is to improve method performance, but that often depends on the cooperation of the manufacturer. Today's highly automated systems may not allow you to make any process improvements on your own. The other option is to beef up the statistical QC procedure so you can detect smaller changes, reject runs having defects, and perform corrective actions to prevent further defects. You still have control of your laboratory QC procedure and can select control rules and numbers of control measurements to assure the quality needed by your laboratory.
Different QC procedures (e.g., different control rules and different numbers of control measurements) have different sensitivities or capabilities for detecting analytical errors. The accompanying graph, called a power function graph, shows the rejection characteristics or power curves for QC procedures that are commonly used in clinical laboratories [2]. The control rules and number of control measurements are given in the key at the right, where the lines, top to bottom, correspond with the curves on the graph, top to bottom. All these QC procedures are for two control measurements (N=2), but with different control rules. The top line is for a 1_2s control rule, which corresponds to a Levey-Jennings control chart having limits calculated as the mean plus and minus 2 SD. The middle line corresponds to a multirule procedure that employs three different control rules: 1_3s/2_2s/R_4s. The other lines are single-rule procedures with control limits of 2.5s, 3.0s, and 3.5s.
The y-axis gives the probability of rejection from 0 to 1.0 and the x-axis gives the size of a systematic error in multiples of the method standard deviation (s). The probability for an x-value of 0.0 describes the probability for false rejection. The probability for an x-value of 1.5 describes the probability of detecting a systematic shift equivalent to 1.5 times the method standard deviation. For example, if the tolerance or quality requirement were 12% and the method has a CV of 2% (6-sigma performance), then a systematic error of 3% or 1.5s corresponds to the tolerable bias in the six sigma model. A QC procedure having 2 control measurements and control limits set at 3s would have a probability of false rejection of almost 0.00 and a probability of about 0.12, or a 12% chance, of rejecting a production run having a systematic shift or error equivalent to 1.5s.
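The two probabilities read from the graph can be approximated for single-rule procedures with a short calculation. The sketch below is only an illustration, not the method used to produce the published power curves: it assumes independent Gaussian control measurements and a shift that affects every control in the run, and the function name `rejection_probability` is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, SD 1


def rejection_probability(limit_sd, n_controls, shift_sd=0.0):
    """Chance that at least one of n_controls falls outside +/- limit_sd.

    Assumption: controls are independent and Gaussian, and the mean has
    shifted by shift_sd standard deviations for the whole run.
    """
    # Probability that a single control stays inside the limits after the shift
    inside = STD_NORMAL.cdf(limit_sd - shift_sd) - STD_NORMAL.cdf(-limit_sd - shift_sd)
    return 1.0 - inside ** n_controls


# False rejection (shift = 0) and detection of a 1.5s shift, all with N=2
for limit in (2.0, 2.5, 3.0, 3.5):
    p_fr = rejection_probability(limit, n_controls=2)
    p_ed = rejection_probability(limit, n_controls=2, shift_sd=1.5)
    print(f"{limit:g}s limits, N=2: Pfr={p_fr:.3f}  Ped(1.5s)={p_ed:.3f}")
```

Under these assumptions, 3s limits with N=2 give a false rejection probability well below 0.01 and a detection probability near 0.13 for a 1.5s shift, which is in line with the values read from the graph. Multirule procedures are not covered by this formula; their power curves are usually obtained by simulation.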
A laboratory metric that has been used to select and design QC procedures is the critical systematic error [3,4], ΔSE_crit, which can be calculated from the tolerance or quality requirement defined for the test and the imprecision and inaccuracy observed for the method, as follows:
$$\Delta SE_{crit} = \frac{TE_a - bias_{meas}}{s_{meas}} - z$$
where TE_a is the allowable total error, bias_meas is the observed inaccuracy of the method, s_meas is the observed imprecision of the method, and z is a value that specifies a maximum defect rate at which a run should be rejected. This z-value is often set as 1.65, to specify that an analytical run should be rejected when the defect rate reaches 5%, i.e., when 5% of the tail of the distribution exceeds the defined tolerance or quality requirement. Other z-values can be selected to trigger run rejection at a lower maximum defect rate, e.g., a z-value of 2.33 would set a 1% maximum defect rate as the condition for rejection of a run.
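The calculation can be sketched in a few lines. The function names below are hypothetical, and the z-value helper assumes a Gaussian error distribution with the defect rate counted in one tail only, as the text describes.

```python
from statistics import NormalDist


def z_for_max_defect_rate(max_defect_rate):
    """One-sided z-value leaving max_defect_rate in the tail beyond the limit.

    Assumption: Gaussian errors, defects counted in one tail only.
    """
    return NormalDist().inv_cdf(1.0 - max_defect_rate)


def critical_systematic_error(tea, bias, sd, z=1.65):
    """Critical systematic error in multiples of the method SD.

    tea, bias, and sd must share the same units (e.g., percent);
    bias is passed as a positive magnitude.
    """
    return (tea - bias) / sd - z


# Example values used in this essay: TEa 12%, bias 0%, CV 2%
print(round(critical_systematic_error(tea=12.0, bias=0.0, sd=2.0), 2))
# z-value for a 5% maximum defect rate (rounded to 1.65 in the text)
print(round(z_for_max_defect_rate(0.05), 3))
```

Passing a smaller defect rate to `z_for_max_defect_rate` gives a larger z-value and therefore a smaller critical systematic error, which places a heavier demand on the QC procedure.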
This concept of a critical systematic error is shown in the accompanying figure, which illustrates the shift of the distribution to the point where 5% of one tail exceeds the defined quality requirement. This figure corresponds to the 6-sigma process described above, where bias_meas is 0.0%, s_meas is 2.0%, and TE_a is 12%, so TE_a is 6 times the standard deviation of the method. The QC design metric, ΔSE_crit, would be 6 - 1.65 = 4.35s, which indicates that a systematic shift equivalent to 4.35 times the standard deviation of the method needs to be detected by the QC procedure. Because this parameter expresses the QC design objective as a multiple of the standard deviation, it is also a sigma-metric.
To assess the probability of detecting a critical systematic error of 4.35s, the previous power function graph can be used. An x-value of 4.35s would actually be off the x-scale. As evident from the power curves, all the QC procedures shown would provide at least 90% detection of a critical systematic error greater than 4.0s. The best choices would be those procedures having low probabilities for false rejection, such as 1_3.5s or 1_3s control rules with N=2.
The sigma-metric that is used in the Six Sigma model to express process capability can be expressed in laboratory terms as follows:
$$Process_{sigma} = \frac{TE_a - bias_{meas}}{s_{meas}}$$
The relation between the sigma-metric for process capability and the sigma-metric for QC design is obvious:
$$\Delta SE_{crit} = Process_{sigma} - z$$
The implication is that the error detection capability of the QC procedure should complement the performance capability of the process. With high process capability, the errors that cause defective results will be large and more easily detected by statistical quality control. As process capability decreases, the errors that must be detected get smaller, which requires better detection capabilities for the statistical QC procedure.
When TE_a is 12%, bias_meas is 0.0%, and s_meas is 2.0%, the Process_sigma metric is 6, i.e., a 6-sigma process. As discussed above, a 6-sigma process can be easily controlled in the laboratory by use of either a 1_3.5s or 1_3s control rule with N=2.
On the other hand, if the method had a CV of 2.4%, then this is a 5-sigma process and a systematic shift of 3.35s needs to be detected with a high probability. As shown in the figure here, it is possible to achieve 90% detection with either a 1_2.5s single-rule procedure or a 1_3s/2_2s/R_4s multirule procedure, each having 2 control measurements per analytical run. The 1_3.5s and 1_3s QC procedures that were acceptable for a 6-sigma process no longer provide sufficient error detection. Note that the key at the right of the graph shows the probability of error detection under the column headed Ped. The column headed Pfr gives the probability of false rejection.
If the method had a CV of 3.0%, then this is a 4-sigma process and a systematic shift of 2.35s needs to be detected with a high probability. None of the QC procedures with N=2 can provide 90% detection of this systematic error, but it is possible that increasing N to 4 will provide sufficient error detection (which can be demonstrated with a similar graph for N=4). If the method had a CV of 4.0%, then we have a 3-sigma process and a systematic error of 1.35s needs to be detected with a high probability. When the critical systematic error is this small, it becomes very difficult to provide sufficient error detection by any statistical QC procedure.
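The four cases above (CVs of 2.0%, 2.4%, 3.0%, and 4.0% against a 12% requirement with zero bias) can be tabulated with a short script. This is a rough sketch under a Gaussian, independent-controls assumption for single-rule procedures only; the helper names are hypothetical, and the numbers will not match the published power curves exactly.

```python
from statistics import NormalDist

PHI = NormalDist().cdf          # standard Gaussian cumulative distribution
TEA, BIAS, Z = 12.0, 0.0, 1.65  # example values from the text (percent, percent, z)


def p_reject(limit_sd, n_controls, shift_sd):
    """Rejection probability for a single-rule procedure (Gaussian assumption)."""
    inside = PHI(limit_sd - shift_sd) - PHI(-limit_sd - shift_sd)
    return 1.0 - inside ** n_controls


def summarize(cv):
    """Sigma-metric, critical shift, and detection probabilities for one CV (percent)."""
    sigma = (TEA - BIAS) / cv
    dse_crit = sigma - Z
    return {
        "sigma": sigma,
        "dse_crit": dse_crit,
        "ped_3s_n2": p_reject(3.0, 2, dse_crit),
        "ped_2.5s_n2": p_reject(2.5, 2, dse_crit),
        "ped_2.5s_n4": p_reject(2.5, 4, dse_crit),
    }


for cv in (2.0, 2.4, 3.0, 4.0):
    row = summarize(cv)
    print(f"CV={cv:.1f}%  " + "  ".join(f"{key}={value:.2f}" for key, value in row.items()))
```

For the 5-sigma case, this approximation gives detection of about 0.87 with 3s limits and about 0.96 with 2.5s limits at N=2, consistent with the conclusion that the wider limits no longer suffice. For the 3-sigma case, detection stays far below 0.90 even with four controls.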
As can be seen from these examples, the requisite QC procedure will depend on the process capability. Lower process capability requires more QC to detect smaller errors. A power function graph that displays the critical systematic error, called a critical-error graph, provides one tool for assessing QC performance and selecting the requisite QC procedure. A detailed example is available in the scientific literature that demonstrates the use of critical-error graphs to select QC procedures for a multitest chemistry analyzer [5].
It's also possible to use a simple graphical tool, similar to the method decision chart [6], to characterize process capability, in the following way:
- Define the tolerance or allowable total error, e.g., 12% in this example.
- Scale the y-axis from 0 to the allowable total error, i.e., 0 to 12%.
- Scale the x-axis from 0 to half of the allowable total error, i.e., 0 to 6% in this example.
- Draw lines from a y-intercept of 12% (the allowable total error) to x-intercepts of 2.0%, 2.4%, 3.0%, 4.0%, and 6.0%. These lines correspond to process capabilities of 6-sigma, 5-sigma, 4-sigma, 3-sigma, and 2-sigma, respectively.
- Plot the observed imprecision (CV in %) and observed inaccuracy (bias in %) of your method to describe the operating point of your method.
- Identify the line immediately above the operating point to estimate the process capability.
An example graph is shown here. If the performance of a method were plotted as an operating point on this chart, its location could be related directly to a process capability metric expressed as a number of sigmas, in the manner utilized in Six Sigma Quality Management. For example, if a method had a CV of 2.0%, the operating point would fall on the x-axis at a value of 2.0 units, indicating this is a 6-sigma method. If a method had a CV of 2.0% and a bias of 3.0%, the operating point would fall between the 5-sigma and 4-sigma lines. If a method had a CV of 4.0%, the x-value of 4.0 indicates this is a 3-sigma process, which is the minimum capability that is tolerated in industrial applications.
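The chart construction and the three operating points just described can be checked numerically. In this sketch each sigma line is treated as the straight line bias = TEa - sigma x CV, which follows from the intercepts given in the steps above; the function names are hypothetical.

```python
SIGMA_LEVELS = (6, 5, 4, 3, 2)  # lines drawn on the chart, most to least capable


def x_intercept(tea, sigma):
    """CV at which a sigma line meets the x-axis (bias = 0)."""
    return tea / sigma


def sigma_metric(tea, cv, bias=0.0):
    """Process capability of an operating point, in multiples of the SD."""
    return (tea - bias) / cv


def line_above(tea, cv, bias=0.0):
    """Sigma level of the line on or immediately above the operating point.

    Returns None when the point lies above the 2-sigma line.
    """
    metric = sigma_metric(tea, cv, bias)
    for level in SIGMA_LEVELS:
        if metric >= level:
            return level
    return None


tea = 12.0  # allowable total error in percent, as in the example
print([round(x_intercept(tea, level), 1) for level in SIGMA_LEVELS])
for cv, bias in ((2.0, 0.0), (2.0, 3.0), (4.0, 0.0)):
    print(cv, bias, sigma_metric(tea, cv, bias), line_above(tea, cv, bias))
```

The point with a CV of 2.0% and a bias of 3.0% has a sigma-metric of 4.5, so the line immediately above it is the 4-sigma line, which matches its position between the 5-sigma and 4-sigma lines on the chart.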
The sensitivity or error detection capability of a QC procedure can be displayed on a similar type of chart, called a chart of operating specifications, or OPSpecs chart [7]. For example, the OPSpecs chart here displays the error detection capabilities of the same QC procedures whose power curves were shown earlier. Note that the x-axis and the y-axis are the same as in the previous figure. An operating point of y=0.0% and x=2.0%, which corresponds to a 6-sigma method, lies below all the lines showing the error detection capabilities of the different control rules, all with N=2. All the QC procedures will be able to detect critical systematic errors, but the ones with the lowest false rejections are preferred to minimize operating costs. The best choices would be the 1_3.5s or 1_3s rules with N=2, whose probabilities for false rejection are essentially zero.
However, if a method had a CV of 2.0% and a bias of 2.0%, neither the 1_3.5s nor the 1_3s rule would provide the necessary error detection. The operating point is now above the bottom two lines in the chart. The requisite QC procedure should be selected from those represented by the top three lines in the chart. Considering the false rejection probabilities, the best choices would be the multirule procedure (0.01 probability for false rejection) or the 1_2.5s single-rule procedure (0.03 probability for false rejection). Use of a 1_2s rule, i.e., a Levey-Jennings control chart having control limits set at plus/minus 2 standard deviations, would have a 0.09 probability for false rejection, or a 9% chance that an analytical run would be judged out-of-control even when performance was stable.
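One way to see where such chart lines come from is to find, for each single-rule procedure, the smallest shift it detects with 90% probability and convert that into the largest CV allowed at a given bias. The sketch below is an approximation under a Gaussian, independent-controls assumption. It is not the procedure used to draw the published OPSpecs charts, the 90% target and all function names are assumptions, and the multirule procedure is left out because it needs simulation.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard Gaussian cumulative distribution


def p_reject(limit_sd, n_controls, shift_sd=0.0):
    """Rejection probability for a single-rule procedure (Gaussian assumption)."""
    inside = PHI(limit_sd - shift_sd) - PHI(-limit_sd - shift_sd)
    return 1.0 - inside ** n_controls


def shift_detected_at(limit_sd, n_controls, target=0.90):
    """Smallest shift (in SD) detected with probability >= target, by bisection."""
    low, high = 0.0, 10.0
    for _ in range(60):  # detection rises steadily with the shift, so bisection works
        middle = (low + high) / 2.0
        if p_reject(limit_sd, n_controls, middle) >= target:
            high = middle
        else:
            low = middle
    return high


def max_cv(tea, bias, limit_sd, n_controls, z=1.65, target=0.90):
    """Largest CV at which the critical systematic error is still detected."""
    return (tea - bias) / (shift_detected_at(limit_sd, n_controls, target) + z)


for limit in (2.0, 2.5, 3.0, 3.5):
    allowed = max_cv(tea=12.0, bias=2.0, limit_sd=limit, n_controls=2)
    print(f"{limit:g}s limits, N=2: max CV at 2% bias = {allowed:.2f}%  "
          f"Pfr = {p_reject(limit, 2):.3f}")
```

With a 12% requirement and a 2.0% bias, this approximation allows a CV of about 1.95% for 3s limits and about 2.16% for 2.5s limits, so an observed CV of 2.0% is covered by the 2.5s limits but not by the 3s or 3.5s limits, in agreement with the chart reading above.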
The purpose of this discussion is to illustrate that there are quantitative tools that can be applied to laboratory testing processes to facilitate the implementation of proper analytical methods. Proper means having the necessary precision, accuracy, and quality control. The available tools are called power function graphs, critical-error graphs, and OPSpecs charts. All these tools have been documented in the scientific literature. Power function graphs were described the earliest, in 1979 [2]. OPSpecs charts were developed in the early 1990s [7]. The time has come to apply these tools and manage analytical quality in the proper way. Six Sigma Quality Management provides a new opportunity to focus attention on process performance, desirable precision, and requisite QC.
1. Westgard JO. Six sigma quality management and desirable laboratory imprecision. http://www.westgard.com/essay35.htm
2. Westgard JO, Groth T. Power function graphs for statistical control rules. Clin Chem 1979;25:863-869.
3. Westgard JO, Burnett RW. Precision requirements for cost-effective operation of analytical processes. Clin Chem 1990;36:1629-1632.
4. Chesher D, Burnett L. Equivalence of critical error calculations and process capability index Cpk. Clin Chem 1997;43:1100-1101.
5. Koch DD, Oryall JJ, Quam EF, Feldbruegge DH, Dowd DE, Barry PL, Westgard JO. Selection of medically useful quality-control procedures for individual tests done in a multitest analytical system. Clin Chem 1990;36:230-233.
6. Westgard JO. A method evaluation decision chart (MEDx Chart) for judging method performance. Clin Lab Science 1995;8:277-283.
7. Westgard JO. Charts of operating specifications (OPSpecs Charts) for assessing the precision, accuracy, and quality control needed to satisfy proficiency testing criteria. Clin Chem 1992;38:1226-1233.
