Jul 31 2017

Acceptance Sampling In The Age Of Low PPM Defectives

Today, some automotive parts manufacturers are able to deliver one million consecutive units without a single defective, and pondering quality management practices appropriate for this level of performance is not idle speculation. Of course, it is only achieved by outstanding suppliers using mature processes in mature industries.

You cannot expect it during new product introduction or in high-technology industries where, if your processes are mature, your products are obsolete. While still taught as part of the quality curriculum, acceptance sampling has been criticized by authors like W. E. Deming and is not part of the Lean approach to quality.

For qualified items from suppliers you trust, you accept shipments with no inspection; for new items or suppliers you do not trust, you inspect 100% of incoming units until the situation improves. Let us examine both what the math tells us about this and possible management actions, with the help of 21st century IT.

Contents

Deming’s Tipping Point
Background On Acceptance Sampling
Example Setup
Sigma Levels
Common Sense Calculations And Their Limitations
Effect Of Sampling
References

Deming’s Tipping Point

In Out Of The Crisis, in 1986, Deming denounced acceptance sampling as a waste of money and blamed the approach for guaranteeing that “some customers will get defective products.” He even proposed the following criterion for a tipping point from no inspection to 100% inspection:

Tipping\,Point = \frac{Cost\,of\,inspecting\, one\,piece}{Cost\,of\,accepting\,a\,defective}

If the fraction defective of the incoming units is below the Tipping Point, 100% inspection doesn’t pay; otherwise, it does. This formula is simple and logically unimpeachable, except that the “cost of accepting a defective” is not exactly easy to assess.
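To make the criterion concrete, here is a minimal sketch in Python. The function names and the dollar amounts are hypothetical, chosen only to illustrate the comparison, and the cost of accepting a defective is precisely the number that is hard to pin down in practice.

```python
def tipping_point(cost_inspect_one: float, cost_accept_defective: float) -> float:
    """Deming's break-even fraction defective."""
    return cost_inspect_one / cost_accept_defective


def policy(fraction_defective: float, cost_inspect_one: float,
           cost_accept_defective: float) -> str:
    """Pick between the two all-or-nothing policies."""
    if fraction_defective < tipping_point(cost_inspect_one, cost_accept_defective):
        return "no inspection"
    return "100% inspection"


# Hypothetical costs, for illustration only: $0.50 to inspect one piece,
# $5,000 for each defective that gets through.
print(tipping_point(0.50, 5_000))   # break-even fraction defective
print(policy(3e-6, 0.50, 5_000))    # incoming quality of 3 dppm
print(policy(1e-3, 0.50, 5_000))    # incoming quality of 1,000 dppm
```

With these made-up costs, the break-even point is 100 defectives per million, so a 3 dppm supplier falls on the no-inspection side and a 1,000 dppm supplier on the 100% inspection side.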

It is dominated by the damage done to the company’s reputation for quality, which is not easily reduced to a number. The intermediate policy of first and last piece checking is only feasible if the lot you receive presents the parts in the order in which they were made because, otherwise, you cannot interpolate the characteristics of the parts in-between.

It is a method you can apply in your own processes where you practice one-piece flow but rarely on purchased items. We don’t really know where the Tipping Point is but we know that the further downstream defectives are detected, the more damage they do, even if we don’t always know how to put a number on it.

If they pass through Receiving and are detected at Assembly, they disrupt the assembly process, particularly if it involves kitting; if they are detected at Final Test for products they go into, the product units are diverted to a repair area where they may have to be taken apart; if they escape to the market, they may harm customers.

So the real question is whether there is anything we can do, short of inspecting every unit, that can be effective at protecting the downstream process. We assume that we have a way to test or inspect each unit that always produces an accurate diagnosis of good or defective. There are two possible objectives worth pursuing:

Ensuring that production receives no defective.
Detecting changes in the supplier’s performance.

Background On Acceptance Sampling

At Receiving, the only choice other than 0% and 100% inspection is sampling and, because the position of a part in a bin contains no information about its position in the production sequence, random sampling. Are Deming and the Lean Quality approach correct in rejecting acceptance sampling outright, or is there any way it can do some good towards either one of the above objectives?

On this subject, the literature on quality is not helpful. It brings you back to World War II, names like Dodge-Romig, and a confusing alphabet soup with names like AQL, LTPD, AOQL, and ASN, the use or misuse of which is exactly what Deming was criticizing.

For example, the best-known acronym, AQL, stands for “Acceptance Quality Limit,” and most people mistakenly assume it to be an upper bound on the fraction defective of accepted materials. It is, in fact, the worst quality that the inspection procedure will pass 95% of the time. A plan with an AQL of 1% will rarely reject lots under 1% defective but may pass half the lots that are 10% defective. It is a parameter of the inspection itself, not a measure of outgoing quality.
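A small calculation shows how this can happen. The sketch below assumes a hypothetical single-sampling plan, not taken from any standard: inspect 5 units drawn at random and accept the lot if none is defective. It also assumes the lot is much larger than the sample, so that the binomial model applies.

```python
from math import comb


def acceptance_probability(n: int, c: int, p: float) -> float:
    """P(at most c defectives in a random sample of n units) when each unit
    is defective with probability p (binomial model, lot much larger than n)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))


# Hypothetical plan, for illustration only: sample 5 units, accept on zero defectives.
SAMPLE_SIZE, ACCEPT_NUMBER = 5, 0

for fraction_defective in (0.01, 0.05, 0.10):
    pa = acceptance_probability(SAMPLE_SIZE, ACCEPT_NUMBER, fraction_defective)
    print(f"{fraction_defective:.0%} defective -> accepted {pa:.1%} of the time")
```

Under these assumptions, lots that are 1% defective pass about 95% of the time, which is what an AQL of 1% means, while lots that are 10% defective still pass about 59% of the time.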

Example Setup

Let’s focus on an example:

We consume 1,000 units/day of a particular component, delivered once a day.
We have received 1,000,000 units of this component from the same supplier, in 1,000-unit increments.
Among these 1,000,000 units, 3 were found defective.

3 defective parts per million is what the Six Sigma people would call “Six-Sigma level.” If we just accept today’s lot and send it to production without any inspection or test, what is the risk that it will contain at least one defective? One nuance is that the Sigma levels, as shown in the following table, are actually for defects per million opportunities (dpmo), while we are looking at defective parts per million (dppm), which is not the same.

An automatic transmission case, for example, has more than 2000 critical dimensions, each of which is a defect opportunity, and you count a case as defective if it has at least one defect. 3 dppm represents a much higher performance than 3.4 dpmo, but it has been achieved in the auto parts industry.
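To get a feel for the gap between the two measures, the following sketch converts a dpmo figure into dppm. It assumes that the defect opportunities on a unit are independent of one another, which is a simplification made here for illustration and not a claim about real transmission cases.

```python
from math import expm1, log1p


def dppm_from_dpmo(dpmo: float, opportunities_per_unit: int) -> float:
    """Defective parts per million, ASSUMING independent defect opportunities:
    a unit is defective if at least one of its opportunities is a defect."""
    p_defect = dpmo / 1e6
    # 1 - (1 - p)^k, written with log1p/expm1 for numerical accuracy at tiny p
    p_unit_defective = -expm1(opportunities_per_unit * log1p(-p_defect))
    return 1e6 * p_unit_defective


print(f"{dppm_from_dpmo(3.4, 2000):,.0f} dppm with 2,000 opportunities per unit")
print(f"{dppm_from_dpmo(3.4, 1):,.1f} dppm with a single opportunity per unit")
```

Under the independence assumption, 3.4 dpmo on a part with 2,000 opportunities works out to several thousand dppm, roughly three orders of magnitude worse than 3 dppm.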

Sigma Levels


Sigma Level	Defects Per Million Opportunities
1	690,000
2	308,537
3	66,807
4	6,210
5	233
6	3.4
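The values in this table agree with the usual Six Sigma convention of taking a one-sided Gaussian tail after allowing for a 1.5-sigma shift of the process mean. That convention is not stated in the text, so treat it as an assumption of the sketch below, which recomputes the column with Python's standard library.

```python
from statistics import NormalDist

SHIFT = 1.5  # conventional Six Sigma allowance for long-term drift (assumption)


def dpmo(sigma_level: float, shift: float = SHIFT) -> float:
    """One-sided Gaussian tail area beyond (sigma_level - shift), per million."""
    # P(Z > level - shift) = P(Z < shift - level) by symmetry
    return 1e6 * NormalDist().cdf(shift - sigma_level)


for level in range(1, 7):
    print(level, f"{dpmo(level):,.1f}")
```

Levels 2 through 6 come out within rounding of the table. Level 1 comes out at about 691,462, which the table rounds to 690,000.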

Common Sense Calculations And Their Limitations

The common sense approach is to infer from the supplier’s past performance that each unit delivered is good with probability:

p = \frac{999,997}{1,000,000} = 0.999997

The next 1,000-unit lot is free of any defective with probability:

P(\textup{defect-free\,lot}) = 0.999997^{1000} = 99.7\%

and therefore, a lot with at least one defective unit will occur on the order of once a year. This simple logic, however, breaks down when we have observed zero defects, because, regardless of the lot size or the number of units you have already received, it gives you:

P(\textup{defect-free\,lot}) = 100\%

It doesn’t make sense, because we know that a perfect record does not guarantee a perfect performance on the next lot, particularly when the record is short, for example when you have received only one lot of 1,000 parts, as opposed to 1,000,000.
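The common-sense calculation and its blind spot both fit in a few lines. This is only a sketch, with a function name of our own choosing:

```python
def p_defect_free_lot(defectives_seen: int, units_seen: int, lot_size: int) -> float:
    """Plug-in estimate: treat the observed fraction of good units as the
    true probability that any one unit is good."""
    p_good = (units_seen - defectives_seen) / units_seen
    return p_good ** lot_size


# 3 defectives in 1,000,000 units, next lot of 1,000
p = p_defect_free_lot(3, 1_000_000, 1_000)
print(f"{p:.1%} defect-free, i.e. one bad lot in about {1 / (1 - p):.0f}")

# With a perfect record, the estimate is 100% no matter how short the record is.
print(p_defect_free_lot(0, 1_000, 1_000))
print(p_defect_free_lot(0, 1_000_000, 1_000))
```

The last two lines print the same value for a record of 1,000 units and for a record of 1,000,000 units, which is exactly the problem.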

The key to resolving this dilemma is to start from the assumptions you make before you have received the first 1,000-unit lot from this supplier and refine them based on what you learn as you receive more. The assumption that all the numbers of defective lots from 0 to L are equally likely is called the uniform prior. If you start from the uniform prior, Laplace’s rule of succession is that the probability of having lot L+1 defect-free given that c of the previous L lots were defect-free is:

P(\textup{Lot\,L+1\,is\,defect-free} ) = \frac{c+1}{L+2}

Let us go back to our example of 3 defective units out of 1 million received. They could have all been in the same lot, in two different lots, or in three different lots, the last being the most likely. In any case, we know from historical records which it was. If they were in 3 different lots and we plug c = 997 and L = 1,000 into Laplace’s formula, we get 99.6%, which closely matches the common-sense result above.

Unlike the common-sense formula, however, Laplace’s gives us answers for the zero defects case, regardless of how many lots we already received. For the first one, L = c = 0, we get 1/2; for the second one, 2/3; for the 1,001st one, after 1,000 defect-free lots, 1001/1002 = 99.9%.

The probability for the next lot to be free of defects is 99.6%, and it rises slightly with the receipt of each new defect-free lot. At today’s level, it means that, on average, 1 out of every 250 lots will contain at least one defective, which, at one lot per day, is on the order of once a year.
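Laplace’s rule is simple enough to check directly. The sketch below uses exact fractions; the function name is ours.

```python
from fractions import Fraction


def rule_of_succession(defect_free_lots: int, lots_received: int) -> Fraction:
    """Laplace: P(next lot is defect-free) = (c + 1) / (L + 2), uniform prior."""
    return Fraction(defect_free_lots + 1, lots_received + 2)


print(rule_of_succession(0, 0))   # 1/2, before any lot has been received
print(rule_of_succession(1, 1))   # 2/3, after one defect-free lot

p = rule_of_succession(997, 1000)  # 3 lots out of 1,000 had a defective
print(p, f"= {float(p):.1%}")
print("lots between defective lots, on average:", float(1 / (1 - p)))
```

With c = 997 and L = 1,000, the fraction reduces to 499/501, or 99.6%, and the average interval is 250.5 lots.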

How much effort should you put into planning a response for a once-a-year event? Not much, perhaps, unless you apply control chart thinking and interpret the occurrence of a defective as proof that the supplier’s process has failed.

With the supplier’s performance as demonstrated over 1,000,000 units in the past three years, any defective in today’s lot is strong evidence that it is no longer performing at the same level, that the defective has an assignable cause, and that you should (1) issue a Quality Problem Report (QPR) to the supplier and (2) inspect all incoming units until you are confident that performance is back up.

Effect Of Sampling

To consider sampling, we need to switch focus from the “true or false” of lots being defect-free or not to the number of defectives in a lot. Laplace’s rule of succession, even if not intuitively obvious, is easy to calculate, but it works at the macro level of lots and does not let us quantify how inspecting a random sample from the new lot affects the risk of passing defective units to production. For that, we need to work at the micro level of individual units within lots.

The Formula

At this level, the uniform prior is that, before we knew better, to us, all fractions defective were equally likely. What we now know is that, over three years, we have received

$N_{0} = 1,000,000$ units from the supplier, of which $R_{0} = 3$ were defective.

Today, we received $N_{1} = 1,000$ , of which $R_{1}$ are defective.

Let’s find the probability that a lot sent on to production without any sampling is free of defectives, meaning $R_{1} = 0$ . Then we’ll see if we can increase this probability by sampling.

We can base our thinking about the supplier’s process on Deming’s red bead experiment. Defectives and good units are mixed in an urn, and the supplier’s process is akin to pulling a paddle load of beads — a lot — from the urn. In our example, the urn contains 1,001,000 units.

We have already pulled 1,000,000 beads, found three reds, and are wondering whether the remaining 1,000 are all white. We wouldn’t think of an in-house manufacturing process this way, particularly if it is one-piece flow. Instead, we would view its output as a sequence of units, with the sequence itself carrying information about the process.

This is why, as discussed above, we may do first-and-last-piece checking on a production run. On a supplier process, we usually do not have this level of information.

[Figure: Red Bead Experiment, by Michael Arthur Johnson]

Let’s introduce a few notations:

$N_{0}$ is the number of units on which we have full information.
$R_{0}$ is the number of defectives found among the $N_{0}$ parts.
$N_{1}$ is the size of the lot we just received.
$R_{1}$ is the unknown number of defectives in the lot.
$p\left ( r | N_{1}, N_{0}, R_{0} \right )$ is the probability that $R_{1} = r$ , given what we know.

Based on the theory of red and white beads picked from urns without replacement, using Bayes formula, E.T. Jaynes gives us the following, general formula for $r = 0,..., N_{1}$ :

p\left ( r | N_{1}, N_{0}, R_{0} \right ) = \frac{\binom{R}{r}\times\binom{N - R}{N_{1} - r}}{\binom{N+1}{N_{0}+1}} \textup{ where }N = N_{0} + N_{1} \textup{ and }R = R_{0} + r
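With arbitrary-precision integers, this formula can be evaluated as written. The following is a sketch in Python using only the standard library, not a vetted implementation; the function and variable names are ours.

```python
from fractions import Fraction
from math import comb


def p_defectives(r: int, n1: int, n0: int, r0: int) -> Fraction:
    """P(R1 = r | N1, N0, R0) from the formula above, computed exactly."""
    n = n0 + n1        # N: all units, inspected or not
    big_r = r0 + r     # R: all defectives, found or not
    return Fraction(comb(big_r, r) * comb(n - big_r, n1 - r), comb(n + 1, n0 + 1))


# Sanity check on a small, made-up case: the probabilities sum to exactly 1.
assert sum(p_defectives(r, 5, 10, 1) for r in range(6)) == 1

# The example's numbers: N0 = 1,000,000, R0 = 3, N1 = 1,000.
for r in range(3):
    print(r, f"{float(p_defectives(r, 1_000, 1_000_000, 3)):.6f}")
```

For r = 0, this gives a probability of about 99.6% that the new lot is free of defectives.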

If you put yourself in the shoes of Walter Shewhart in 1925, you can’t use this theoretical formula because you can’t compute binomial coefficients with large arguments using slide rules, books of tables, and adding machines. So you reach for approximations. If you are picking 1,000 beads from an urn that contains 1,000,000, it doesn’t make much difference whether you put the beads back into the urn or not.

If you do put them back in, you can use the simpler binomial distribution, which is itself approximated by a Gaussian with mean $N_{1}p$ where

p = R_{0}/N_{0}

and standard deviation

$\sigma = \sqrt{N_{1}\times p \times (1-p)}$ .

With $p$ sufficiently low, you can even approximate it with the even simpler Poisson distribution with mean $N_{1}p$ and standard deviation $\sqrt{N_{1}\times p}$ .
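As a rough comparison, the sketch below evaluates the binomial and Poisson approximations next to the urn formula for the number of defectives in a lot of 1,000 units, with 3 defectives seen in the previous 1,000,000. The function names are ours, and the urn formula is coded from the expression given earlier.

```python
from math import comb, exp, factorial

N0, R0, N1 = 1_000_000, 3, 1_000   # the example's numbers
p = R0 / N0                         # plug-in fraction defective


def binomial_pmf(r: int) -> float:
    return comb(N1, r) * p**r * (1 - p)**(N1 - r)


def poisson_pmf(r: int) -> float:
    mean = N1 * p
    return exp(-mean) * mean**r / factorial(r)


def exact_pmf(r: int) -> float:
    """Urn formula without replacement, uniform prior on the fraction defective."""
    n, big_r = N0 + N1, R0 + r
    return comb(big_r, r) * comb(n - big_r, N1 - r) / comb(n + 1, N0 + 1)


print("r  binomial  Poisson   exact")
for r in range(3):
    print(r, f"{binomial_pmf(r):.6f}  {poisson_pmf(r):.6f}  {exact_pmf(r):.6f}")
```

The binomial and Poisson columns agree closely with each other. The exact value for r = 0 is slightly lower, about 99.6% against 99.7%. The gap appears to come mostly from the uniform prior, which behaves as if one more defective had been seen, not from drawing with or without replacement.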

This is the thinking behind the use of p-charts to detect shifts in supplier performance and, to this day, I have not seen any literature on statistical quality proposing to use the original formula. But Shewhart didn’t have Excel, R, Minitab or any of the other tools we have in 2017 to instantly evaluate the formula without worrying about the validity of any approximation.

Can Acceptance Sampling Move The Needle?

In our example, assuming no change in the supplier’s capability, we now know that, if we just pass the lot through to production, it will be free of defectives with a 99.6% probability. Is there anything we can do by drawing a sample from the current lot to increase this number? If we draw and inspect a sample of n units, then:

The number of units we have information on grows to $N_{0}+n$ .
The number of unknown units in the lot shrinks to $N_{1}-n$ .

Dealing with a supplier who has delivered 3 defectives out of the last 1,000,000, we also know that we need to consider only perfect samples because, if there is any defective in the sample, then there is one in the lot, and we already know that this is enough cause to issue a QPR to the supplier and inspect 100% of this lot and future lots until the problem is solved.

The quantity we are looking for is therefore:

c(n) = p\left ( 0 | N_{1}-n, N_{0}+n, R_{0} \right )\textup{ for }n=0,...,N_{1}

where:

$n = 0$ is the previous case, where we accept the lot with no inspection.
$n = N_{1}$ is 100% inspection.

With:

$N_{0} = 1,000,000$
$R_{0} = 3$
$N_{1} = 1000$

the results are:


Sample Size	Probability of Defect-Free Lot	Mean Time Between Defectives
0	99.6%	249 Days
260	99.7%	332 Days
510	99.8%	499 Days
760	99.9%	999 Days
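The quantity c(n) can be evaluated directly with integer arithmetic. The sketch below is our own coding of the formula, under the same assumption of unchanged supplier capability. It reproduces the rounded probabilities in the table, while the mean intervals it prints may differ from the table’s by a few percent, possibly because of rounding or a slightly different calculation.

```python
from math import comb

N0, R0, N1 = 1_000_000, 3, 1_000


def p_rest_defect_free(n: int) -> float:
    """c(n) = p(0 | N1 - n, N0 + n, R0): after a defect-free sample of n units,
    probability that the N1 - n uninspected units are all good."""
    total = N0 + N1
    return comb(total - R0, N1 - n) / comb(total + 1, N0 + n + 1)


for n in (0, 260, 510, 760, 1000):
    c = p_rest_defect_free(n)
    interval = "none expected" if c == 1 else f"{1 / (1 - c):,.0f} lots"
    print(f"n={n:>4}  P(defect-free)={c:.1%}  mean interval between defectives: {interval}")
```

At one lot per day, the interval in lots is also the interval in days. The case n = 1,000 is 100% inspection, for which the probability is exactly 1.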

Is Acceptance Sampling Worth Doing?

These are not the kinds of ratios you usually look for in sampling. Prior to 2016, political polls, for example, used to predict voting by 130 million people to ±3% for each candidate, based on samples of about 2,000 voters, or about 15 voters/million. Here, we are talking about at least 1 in 4. In addition, the mean numbers of days between defectives also have to be considered in light of the following:

They are based on the assumption that the supplier’s quality performance stays the same.
The mean times between defectives may be large with respect to the remaining product life, for example, if you are going to be using the item for only one more year.

Given the rarity of defectives at this level, rather than setting up an incoming inspection or test operation for samples, you get better control through go/no-go gauge checking and mistake-proofing devices integrated with the production process. While possibly less thorough than an incoming inspection or test, they apply to 100% of the parts, do not delay the flow of materials, and do not add labor.

Detecting Changes In The Supplier’s Capability

As indicated above, we are working in a range of quality performance where a single defective in a lot is sufficient evidence of a shift in the supplier’s process to warrant action. Besides issuing a QPR to the supplier, this action usually includes performing 100% incoming inspection or test until you solve the problem, not acceptance sampling.

References

Grieve, A.P. (1994) A further note on sampling to locate rare defectives with strong prior evidence, Biometrika, Volume 81, Issue 4, December 1994, Pages 787–789.
Jaynes, E.T. (2003) Probability Theory: The Logic of Science, Cambridge University Press.


By Michel Baudin • Laws of nature • 6 • Tags: Acceptance Sampling, DPMO, Lean Quality, Six Sigma, SPC

6 Comments

Renaud Anjoran
August 1, 2017 @ 12:09 pm

Nice demonstration. It really doesn’t move the needle in this situation.
It’s amazing how few people really understand the ‘AQL’. You are right, an AQL of 1.0% doesn’t make it impossible to accept a lot containing 10% of defectives (especially for small samples). Even for a sample of 200 units, lots containing over 5% of defectives can be expected to be accepted 10% of the time.

- Michel Baudin
  August 1, 2017 @ 1:07 pm
  
  It’s good to hear from you. I don’t blame any user for being confused by AQLs, but I do blame the people who came up with such an acronym. The sooner we retire it, the better.
  
  - Renaud Anjoran
    August 2, 2017 @ 12:51 am
    
    The AQL originally stood for “Acceptable Quality Level”. That was extremely confusing and carried another meaning. Then it was renamed into “Acceptance Quality Limit”. Better terminology. But the concept and, above all, its applicability and limits, are still widely misunderstood.

Kyle Harshberger
March 10, 2019 @ 7:09 am

I love how simple Deming’s approach is to the inspection problem. The math is so straightforward that counterarguments should be easily refuted.

One difficulty in implementing it likely lies with quality departments themselves. Sampling inspection is likely the bulk of their job, and this approach takes it away. I suppose this is one of the fundamental transformations of business he spoke of.
