Measuring QC Efficacy: A Proposal

[The featured image is of a Vietnamese satellite undergoing final test in Japan]

As Jay Bitsack pointed out in his comments on LinkedIn about my previous post, the portability of a method from epidemiology to manufacturing quality is not a foregone conclusion. Formally, the logic of validating a vaccine seems applicable to the solution of a quality problem. They look similar when you consider only outcomes in terms of infection rates or the proportion of defectives.

There are differences between data sets from a clinical trial and tests run before and after a process change in production that may affect the applicability of a method. We examine the conditions for the approach developed by Carlo Graziani for vaccine efficacy to cross over to quality control. Then we work out the math of Graziani’s method and the means to apply it.

Data Science In Health Care And QC

Epidemiology and pharmacology have kept up with new developments in data science. Manufacturing quality control has not. It has remained stuck at the World War II level of statistics and should learn newer techniques from other domains.

Graziani’s method is Bayesian, an approach that has been around for 250 years, but SPC/SQC ignores it. The founders of this discipline were trained in a different school of thought. Bayesian methods, generally, are of interest because they support learning, meaning refining models based on new data.

In Quality Control, in particular, they provide answers to questions that the classical methods can’t, as in the case of the previous post: when you receive 0 defectives out of N units from a supplier, how do you estimate a confidence interval for the defective rate in the supplier’s output? Here we show how, within a range of applicability, Graziani’s method produces a probability distribution for this rate, based on prior knowledge and incoming quality data.

The method described here is not, to our knowledge, currently applied anywhere in manufacturing. The point is to encourage practitioners to try it, as it is mathematically no more complex than the classical methods they are using.

Efficacy and Effectiveness

“Efficacy” is not a manufacturing term. It’s a health care concept, quantifying the ability of a drug or a treatment to achieve a result. In manufacturing, or more generally in business, it sounds like effectiveness.

However, in most discussions, effectiveness is an attribute that a policy or an action may have rather than a metric. You might say, for example, that, when given to a whole population, a vaccine with 80% efficacy is effective at stopping an epidemic.

In a Technical Review for the Agency for Healthcare Research and Quality, Gerald Gartlehner et al. explained the distinction as follows:

“Efficacy trials determine whether an intervention produces the expected result under ideal circumstances. Effectiveness trials measure the degree of beneficial effect under ‘real world’ clinical settings.”

As discussed in the last post, the efficacy of a vaccine is the reduction it causes in the probability of infection with a given disease per unit time per person. You estimate it by comparing properly selected samples of recipients of the vaccine and a placebo.

Diseases versus Defects

What is the difference with the solution of a quality problem? As this solution eliminates a cause of defects, it should reduce the probability that a product unit is defective.

Infections Versus Defects

Unlike communicable diseases, defects are usually not transmitted from one unit to the next. If you mount the wrong gasket on a part, it doesn’t affect other parts. Except for biological contamination in foods or drugs, defects are introduced by variations or errors in processing, handling, or storage, not by contagion.

That the mechanisms are inherently different does not mean that the same probability models cannot be used. They just must be validated from the data.

In-house Versus Supplier Defects

If the defects originate in your own production lines, although it is most often overkill, you can validate the solution with a statistically designed experiment you could call a “clinical trial.”

Otherwise, the defects originate with a supplier. Then you compare the population of parts before and after the supplier implemented the solution, assuming it is the only difference between the two populations. The passage of time, however, may be associated with other changes.

For example, as the area of the plant transitions from a dry to a wet season, humidity on the shop floor may affect quality in ways that are unrelated to process improvements and will be reversed with the next seasonal transition. Only if no such external issues exist can you use before-and-after groups to validate the solution.

Defects and Defectives

The quality literature doesn’t bother defining defects and defectives, and translations of Japanese books don’t make the distinction. In the ASQ Glossary, a defect is

“A product’s or service’s nonfulfillment of an intended requirement or reasonable expectation for use, including safety considerations.”

In plain language, a defect is anything that makes a product malfunction; a defective is a unit of product that has at least one defect.

At Final Test on a production line, the most common policy is to stop testing as soon as you find a defect, with thorough testing for all defects reserved for off-line failure analysis. Therefore, the testing sequence has a bearing on the frequency with which you observe a particular defect. As discussed in Revisiting Pareto, when multiple defects can be found on the same unit, eliminating the tallest bar may increase the height of its runner-up.

If, in response to a Quality Problem Report, a supplier implements a solution, it should reduce the proportion of defectives in the shipments from this supplier, reflecting the beneficial effect of the solution. In the health care vocabulary, it would be its effectiveness. The supplier is likely to test first for the defect the solution is supposed to eliminate. Then we can know how thoroughly this solution eliminates this specific defect, in other words, its efficacy.

Applying Graziani’s Method to Defectives

The same method applies whether we are looking to characterize the effectiveness or the efficacy of the solution, with a sharper focus in the latter case, where the data is free of the effects of other defects. As in the post on the COVID-19 Pfizer Vaccine study on teens, you start with a proportions test to establish that the solution did some good and then quantify how much.
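For readers who want to try this first step, here is a minimal sketch of a one-sided test on a before/after table. It uses Fisher's exact test, which is one common choice and not necessarily the proportions test used in the earlier post. The function name and the counts are hypothetical, chosen only for illustration.

```python
from math import comb

def fisher_one_sided(k_b, n_b, k_a, n_a):
    """P-value for the hypothesis that the defective rate is unchanged,
    against the alternative that it is lower after the solution.
    Fisher's exact test, conditioning on the margins of the 2x2 table."""
    k_t = k_b + k_a          # total defectives
    n_t = n_b + n_a          # total units
    lowest = max(0, k_t - n_b)
    # Probability of k_a or fewer defectives landing in the After group
    # if the k_t defectives were spread at random over all n_t units.
    favorable = sum(comb(n_a, x) * comb(n_b, k_t - x)
                    for x in range(lowest, k_a + 1))
    return favorable / comb(n_t, k_t)

# Hypothetical counts: 20 defectives in 1,000 before, 4 in 1,000 after.
p_value = fisher_one_sided(20, 1000, 4, 1000)
print(p_value)
```

A small p-value only says that the solution did some good. It does not say how much, which is what the rest of the method is for.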

Graziani’s explanations are reasonably easy to follow, with one exception. In his formulas, he uses the letter e to designate both the vaccine efficacy and the constant used in the exponential function. To avoid confusion, in what follows, I will use v instead for the efficacy or effectiveness of the solution.

Why Poisson Models of Defect Occurrences

Different probabilistic models are in use for defect occurrences.

The Bernoulli Model For Single Parts

If you make a single unit of product under given process conditions, you get 1 defective with probability p and 0 defectives with probability 1-p. The number of defectives is said to be a Bernoulli variable. It is the simplest probability distribution, with mean p and standard deviation \sigma =\sqrt{p\times\left ( 1-p \right )}.

The Binomial Model For A Lot, Batch, or Shipment of Parts

In a batch or shipment of N units made independently under the same conditions, the number of defectives is the sum of the Bernoulli variables of each unit. It follows the binomial distribution B\left (N,p \right ) with mean N\times p and standard deviation \sigma =\sqrt{N\times p\times\left ( 1-p \right )}.

The probability of having k defectives out of N units is:

P\left ( k\: defectives\: out\: of\; N \right )= \binom{N}{k}p^k\times\left ( 1-p \right )^{\left ( N-k \right )}
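The binomial formula can be evaluated directly with the Python standard library. The sketch below is only an illustration: the function name and the values N = 100 and p = 1% are assumptions made for the example.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k defectives among n independent units,
    each defective with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Illustrative values only: a lot of 100 units with a 1% defective rate.
n, p = 100, 0.01
mean = n * p                    # N x p
sigma = sqrt(n * p * (1 - p))   # sqrt(N x p x (1 - p))

print("mean:", mean, "standard deviation:", sigma)
for k in range(4):
    print(k, binomial_pmf(k, n, p))
```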

The Gaussian (a.k.a. “Normal”) Approximation

Because the coefficient \binom{N}{k} is the number of subsets of k units within a batch of N, it involves complicated combinatorics as soon as N and k grow beyond a handful. For this reason, you quickly replace the binomial with approximations that are easier to handle, most commonly the Gaussian, also known as Normal or bell-shaped, with mean N\times p and standard deviation \sqrt{N\times p\times\left ( 1-p \right )}. It is used, for example, in SPC to set control limits for p-charts.
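To make the Gaussian approximation concrete, the sketch below computes three-sigma p-chart limits the usual textbook way, dividing the standard deviation of the count by N to get that of the proportion. The function name and the sample values are illustrative assumptions.

```python
from math import sqrt

def p_chart_limits(p_bar, n, z=3.0):
    """Lower and upper control limits for the proportion defective in
    samples of n units, using the Gaussian approximation of the binomial."""
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - z * sigma)   # a proportion cannot go below 0
    ucl = min(1.0, p_bar + z * sigma)   # or above 1
    return lcl, ucl

# Hypothetical process averages and sample sizes.
print(p_chart_limits(0.02, 500))
print(p_chart_limits(0.01, 100))
```

With a small p and a modest sample size, the lower limit is clipped at 0, a sign that the symmetric bell shape is a poor fit for rare defects.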

The Poisson Approximation

Graziani, however, uses a different approximation, the Poisson distribution, applicable when N is large and p small, and previously discussed in Series of Events in Manufacturing. It is the most basic model for counts of events that occur independently at a constant mean rate over time during an interval. In this case, it has a mean of N\times p and a standard deviation of  \sqrt{N\times p}. Then:

P\left ( k\: defectives\: out\: of\; N \right )= e^{-N\times p}\times\frac{\left ( Np \right )^k}{k!}

The formula would compute values for k>N but because p is small, the values are vanishingly small long before k approaches N. For example, if  N= 100 and p= 1\% , then

P\left (1\: defective\: out\: of\; 100 \right )= 37\% and

P\left (10\: defectives\: out\: of\; 100 \right )= 0.10 ppm
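These values are easy to recompute. The sketch below evaluates the Poisson formula next to the exact binomial for the same N = 100 and p = 1%, so the two can be compared side by side. The function names are assumptions made for this example.

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    """Poisson probability of k events when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)

def binomial_pmf(k, n, p):
    """Exact binomial probability of k defectives out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.01   # the example values from the text
for k in (0, 1, 2, 5, 10):
    print(k, poisson_pmf(k, n * p), binomial_pmf(k, n, p))
```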

For large N and small p, the Poisson approximation is not only easier to work with than the Gaussian but also closer to the Binomial. The means of the Poisson approximation and the Binomial are identical. As shown in the following figure, their moments up to order four are close matches for  p \leqslant 3\%.

Their variances, or standard deviations, represent how spread out they are; their skewness, how asymmetric; their kurtosis, how “pointy.”
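The comparison of moments can also be checked from the standard closed-form expressions for the two distributions. The sketch below is one way to do it; the function names and the sample size of 1,000 are assumptions, and kurtosis is reported here as the full value, where a Gaussian scores 3.

```python
from math import sqrt

def binomial_moments(n, p):
    """Mean, variance, skewness and kurtosis of the binomial B(n, p)."""
    var = n * p * (1 - p)
    return {
        "mean": n * p,
        "variance": var,
        "skewness": (1 - 2 * p) / sqrt(var),
        "kurtosis": 3 + (1 - 6 * p * (1 - p)) / var,
    }

def poisson_moments(lam):
    """Same four moments for the Poisson distribution with mean lam."""
    return {
        "mean": lam,
        "variance": lam,
        "skewness": 1 / sqrt(lam),
        "kurtosis": 3 + 1 / lam,
    }

n = 1000  # hypothetical sample size
for p in (0.001, 0.01, 0.03):
    print(p, binomial_moments(n, p), poisson_moments(n * p))
```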

The Poisson approximation makes sense when you are making 1,000 units/day in an industry where 1ppm defective is acceptable and 3\% is a disaster, like auto parts, pharmaceuticals, vacuum cleaners, or processed foods. You wouldn’t use it when making 10 units/day, 20\% of which are defective.

Graziani’s method

This section explains the math of Graziani’s method in more detail and, hopefully, in a way that is easier to follow than Graziani’s own paper.

Notations

Let’s use the following notations:

• N_b and N_a designate numbers of parts produced before and after the solution was implemented.
• k_b and k_a are the numbers of defectives found respectively among the N_b and N_a units.
• N_t = N_b + N_a
• k_t = k_b + k_a is the total number of defectives observed before or after.
• r= \frac{N_a}{N_b} is the ratio of the after and before sample sizes.
• p_b and p_a are the corresponding probabilities for any unit to be defective.

This can be expressed in the following contingency table:

\begin{matrix} \textbf{Group} & \textbf{Defective} & \textbf{Good} & \textbf{Total}\\ Before&k_b &N_b-k_b & N_b\\ After&k_a &N_a-k_a & N_a\\ Total& k_t&N_t - k_t & N_t \end{matrix}

We are trying to estimate v =1- \frac{p_a}{p_b} based on prior information and the observations of k_b defectives out of N_b and k_a defectives out of N_a. What Graziani’s method gives us is a probability distribution for  v that becomes narrower as N_a grows.

Sample Probabilities

For the Before and After samples, we have:

• P\left ( k_b\: defectives\; out\; of\: N_b\right ) = e^{-p_b\times N_b}\times \frac{\left (p_b\times N_b \right )^{k_b}}{k_b!}
• P\left ( k_a\: defectives\; out\; of\: N_a\right ) =e^{-p_a\times N_a}\times \frac{\left (p_a\times N_a \right )^{k_a}}{k_a!}

And, by independence, the joint probability is:

P\left ( k_b,N_b,k_a,N_a\right ) = P\left ( k_b\: defectives\; out\; of\: N_b\right )\times P\left ( k_a\: defectives\; out\; of\: N_a\right )
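As a sketch of how this joint probability could be computed, the code below multiplies two Poisson probabilities, using the same form as in the Poisson approximation section, with the mean count in a negative exponent. The function names and the numbers are hypothetical, and logarithms are used only to keep large counts from overflowing.

```python
from math import exp, lgamma, log

def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam > 0.
    Computed through logs: log(k!) is lgamma(k + 1)."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))

def joint_probability(k_b, n_b, p_b, k_a, n_a, p_a):
    """Probability of k_b defectives before and k_a after, assuming the
    two samples are independent and each count is Poisson."""
    return poisson_pmf(k_b, p_b * n_b) * poisson_pmf(k_a, p_a * n_a)

# Hypothetical rates of 2% before and 0.4% after, 1,000 units each.
print(joint_probability(20, 1000, 0.02, 4, 1000, 0.004))
```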

Introducing Efficacy

Remembering that what we are after is the distribution of v, given  k_b,N_b,k_a,N_a, we can eliminate p_b and p_a from the equations by using the distribution of the total number of defectives k_t, which follows the Poisson distribution with mean

\lambda = p_b N_b + p_a N_a = \left [ 1+\left ( 1-v \right )r \right ]p_b N_b

In those terms,

P\left ( k_b,N_b,k_a,N_a\right ) = f\left ( k_b, k_a \right )\times e^{-\lambda}\times\lambda^{k_t} \times \frac{r^{k_a}\left ( 1-v \right )^{k_a}}{\left [ 1+ \left ( 1-v \right )r\right ]^{k_t}}
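The intermediate step can be filled in as follows. This is my own working from the definitions v = 1 - p_a/p_b and r = N_a/N_b, not a quotation from Graziani's paper. Solving the expression of \lambda for the two mean counts and substituting them into the product of the two Poisson probabilities gives:

```latex
p_b N_b = \frac{\lambda}{1+\left(1-v\right)r},
\qquad
p_a N_a = \left(1-v\right) r\, p_b N_b = \frac{\left(1-v\right) r\,\lambda}{1+\left(1-v\right)r}

P\left(k_b,N_b,k_a,N_a\right)
= \frac{e^{-p_b N_b}\left(p_b N_b\right)^{k_b}}{k_b!}
  \times\frac{e^{-p_a N_a}\left(p_a N_a\right)^{k_a}}{k_a!}
= \frac{1}{k_b!\,k_a!}\times e^{-\lambda}\times\lambda^{k_t}
  \times\frac{r^{k_a}\left(1-v\right)^{k_a}}{\left[1+\left(1-v\right)r\right]^{k_t}}
```

Under this working, the factor f\left ( k_b, k_a \right ) is \frac{1}{k_b!\,k_a!}, which depends only on the observed counts.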

Separating Efficacy

This is also the likelihood \mathfrak{L}\left ( v,\lambda \right ) at \left ( k_b,N_b,k_a,N_a\right ) and it is the product of three factors. The first contains neither v nor \lambda, and the second contains only \lambda. The likelihood \mathfrak{L}\left (v \right ) of v is therefore proportional to the third factor:

\mathfrak{L}\left (v \right ) \propto \frac{r^{k_a}\left ( 1-v \right )^{k_a}}{\left [ 1+ \left ( 1-v \right ) r\right ]^{k_t}}

This is a function of the defective counts k_b and k_a, with the before and after sample sizes N_b and N_a appearing through their ratio r.
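The likelihood of v is simple enough to code in a few lines. The sketch below works with its logarithm, which is an implementation choice of mine to avoid very small numbers, and scans a grid of values of v to locate the peak. The function name and the counts are hypothetical.

```python
from math import log

def log_likelihood(v, k_b, k_a, r):
    """Logarithm of the likelihood of v, up to an additive constant,
    for v between 0 and 1."""
    u = 1.0 - v
    k_t = k_b + k_a
    if k_a == 0:
        return -k_t * log(1.0 + u * r)
    if u <= 0.0:
        return float("-inf")   # v = 100% is impossible if k_a > 0
    return k_a * log(r * u) - k_t * log(1.0 + u * r)

# Hypothetical data: 20 defectives before, 4 after, equal sample sizes.
grid = [i / 1000 for i in range(1001)]
best = max(grid, key=lambda v: log_likelihood(v, 20, 4, 1.0))
print(best)
```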

Probability Distribution for Efficacy

To obtain a probability distribution for v, we need to multiply \mathfrak{L}\left (v \right ) by the prior distribution \pi\left ( v \right ) representing our prior knowledge of v, or lack thereof, and a normalizing constant \kappa to make it a probability distribution.

\pi\left (v |k_b, k_a\right ) = \kappa\times\pi\left ( v \right ) \times\frac{r^{k_a}\left ( 1-v \right )^{k_a}}{\left [ 1+ \left ( 1-v \right ) r\right ]^{k_t}}

For efficacy, Graziani uses the ignorance prior, in which \pi\left ( v \right ) = 1 for all v between 0 and 100%, meaning that, before seeing any data, we consider all efficacies to be equally likely.
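With the ignorance prior, the posterior is just the likelihood rescaled to integrate to 1. The sketch below does this numerically on a grid with the trapezoid rule and reads off an equal-tailed interval. The grid size, the function names, and the counts are all assumptions made for illustration, not part of Graziani's paper.

```python
from math import exp, log

def posterior_on_grid(k_b, k_a, r, points=2001):
    """Posterior density of v on an even grid over [0, 1], flat prior."""
    k_t = k_b + k_a
    vs = [i / (points - 1) for i in range(points)]
    logs = []
    for v in vs:
        u = 1.0 - v
        if k_a > 0 and u <= 0.0:
            logs.append(float("-inf"))
        else:
            first = k_a * log(r * u) if k_a > 0 else 0.0
            logs.append(first - k_t * log(1.0 + u * r))
    top = max(logs)
    weights = [exp(x - top) for x in logs]      # rescaled likelihood
    step = vs[1] - vs[0]
    area = step * (sum(weights) - 0.5 * (weights[0] + weights[-1]))
    return vs, [w / area for w in weights]      # 1 / area plays the role of kappa

def credible_interval(vs, density, level=0.95):
    """Equal-tailed interval holding `level` of the posterior probability."""
    step = vs[1] - vs[0]
    tail = (1.0 - level) / 2.0
    cumulative, lower, upper = 0.0, None, vs[-1]
    for i in range(1, len(vs)):
        cumulative += 0.5 * step * (density[i - 1] + density[i])
        if lower is None and cumulative >= tail:
            lower = vs[i]
        if cumulative >= 1.0 - tail:
            upper = vs[i]
            break
    return lower, upper

# Hypothetical data: 20 defectives before, 4 after, equal sample sizes.
vs, density = posterior_on_grid(20, 4, 1.0)
print(credible_interval(vs, density))
```

The grid values in `vs` and `density` can be passed to any plotting tool to draw the distribution.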

Example Plots of Efficacy Distributions

Using this formula, we can plot the posterior probability distribution based on the ignorance prior and the data, as in the following example:

For effectiveness, we are considering eliminating one out of several causes of defects, that is, one bar out of the Pareto chart. The best-case scenario is that the solution eliminates the entire bar and that no other bar grows as a result. If p_o designates the before proportion defective due to other causes, the best possible outcome of implementing the solution is p_a = p_o and the worst is p_a = p_b, which means that v is constrained to the interval \left [ 0, 1-\frac{p_o}{p_b} \right ]. In this case, the ignorance prior becomes

\pi\left ( v \right ) = \frac{1}{1-\frac{p_o}{p_b}} for 0 \leqslant v \leqslant1-\frac{p_o}{p_b}  and \pi\left ( v \right ) = 0 elsewhere.

The \kappa normalizing constant is then most easily estimated numerically.
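One simple way to estimate \kappa numerically is the trapezoid rule over the allowed interval. The sketch below assumes moderate defective counts, so that the powers stay within floating-point range, and a hypothetical upper limit of 75% for v. The function name and the numbers are illustrative.

```python
def kappa_truncated(k_b, k_a, r, v_max, points=2001):
    """Normalizing constant for a flat prior on [0, v_max],
    where v_max stands for 1 - p_o / p_b."""
    k_t = k_b + k_a
    prior = 1.0 / v_max
    values = []
    for i in range(points):
        v = v_max * i / (points - 1)
        u = 1.0 - v
        # prior times likelihood at this grid point
        values.append(prior * (r * u) ** k_a / (1.0 + u * r) ** k_t)
    step = v_max / (points - 1)
    integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))
    return 1.0 / integral

# Hypothetical case: other causes account for a quarter of the defectives.
print(kappa_truncated(20, 4, 1.0, 0.75))
```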

From this formula, we can see that the maximum likelihood estimator of v for efficacy is

v_{m}= 1- \frac{k_a/N_a}{k_b/N_b}

This is as expected. In particular, when defectives have been completely eliminated, k_a = 0 and v_{m}= 100\% . As discussed in the previous post, we can’t conclude from this that  v = 100\% . Graziani’s formula, however, lets you plot the distribution of  v based on the data collected to date and determine a confidence interval at whatever level you need.
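The case k_a = 0 can be worked in closed form, because with the ignorance prior on [0, 1] the posterior of v reduces to a power of 1 + (1-v)r that integrates by hand. The sketch below is my own derivation from the likelihood above, valid for k_b of 2 or more. It returns the value that v exceeds with a chosen posterior probability, the Bayesian counterpart of a one-sided confidence bound. Function names and counts are hypothetical.

```python
def v_mle(k_b, n_b, k_a, n_a):
    """Maximum likelihood estimate of v from the before and after counts."""
    return 1.0 - (k_a / n_a) / (k_b / n_b)

def lower_bound_zero_after(k_b, r, level=0.95):
    """Value v_L such that the posterior probability of v >= v_L is `level`
    when k_a = 0, with a flat prior on [0, 1]. Requires k_b >= 2."""
    if k_b < 2:
        raise ValueError("closed form used here needs k_b >= 2")
    floor = (1.0 + r) ** (1 - k_b)
    # Posterior CDF: F(v) = (w**(1 - k_b) - floor) / (1 - floor),
    # with w = 1 + (1 - v) * r. Solve F(v_L) = 1 - level for w, then v_L.
    c = floor + (1.0 - level) * (1.0 - floor)
    w = c ** (1.0 / (1 - k_b))
    return 1.0 - (w - 1.0) / r

# Hypothetical data: 20 defectives in 1,000 before, none in 1,000 after.
print(v_mle(20, 1000, 0, 1000))
print(lower_bound_zero_after(20, 1.0))
```

The point estimate is 100%, but the bound shows how much of the range below 100% the data still leave open.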

Conclusions

If efficacy is a useful concept in health care, it should be in manufacturing quality as well. How thoroughly a solution eradicates a problem based on outcomes is something manufacturers want to know. They also know that a number based on 50,000 units is more solid than one based on 100 and would like to quantify how much more solid. Within a range encompassing massive segments of manufacturing, Graziani’s method provides these answers. Its math is no more complex than the foundations of classical SQC or SPC, and it requires no more power than a regular laptop computer.