Mar 12 2019
More About the Math of the Process Behavior Chart
In statistics on time series with “moving” in their name, each value is correlated with past and future neighbors; that is, the series is autocorrelated. This affects the way you can use these statistics to detect anomalies and issue alarms.
The moving range in the XmR chart is a case in point. Its autocorrelation is self-inflicted: the moving range is autocorrelated by construction, regardless of whether the raw data themselves are.
Some raw data are autocorrelated. For example, when you issue a replenishment order for a part by pulling a Kanban from a bin, you are assuming that the demand for the coming period will match that of the period that just elapsed, with minor fluctuations. Implicitly, you are leveraging the autocorrelation of the part consumption across periods.
On the other hand, if a physical characteristic of a manufactured part is modeled as the sum of a constant and independent noise, then consecutive values are uncorrelated. Taking moving ranges introduces an autocorrelation between consecutive values that is absent in the raw data.
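A quick simulation makes the contrast concrete. The following is a minimal sketch, assuming Gaussian noise around a constant; the seed, the constant, the sample size, and the helper name lag1_correlation are arbitrary choices for illustration, not values from the original analysis.

```python
import random


def lag1_correlation(values):
    """Pearson correlation between each value and its immediate successor."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)            # arbitrary seed, for reproducibility only
mu, sigma, n = 10.0, 1.0, 20_000   # assumed constant, noise size, sample size

# Raw data: a constant plus independent Gaussian noise.
raw = [mu + rng.gauss(0.0, sigma) for _ in range(n)]

# Moving ranges: absolute differences between consecutive raw values.
moving_ranges = [abs(b - a) for a, b in zip(raw, raw[1:])]

raw_corr = lag1_correlation(raw)
mr_corr = lag1_correlation(moving_ranges)
print(f"lag-1 correlation of raw data:      {raw_corr:+.3f}")
print(f"lag-1 correlation of moving ranges: {mr_corr:+.3f}")
```

The raw series should show a lag-1 correlation near zero, while the moving ranges computed from it should not.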
The Autocorrelation of Moving Ranges
As Don Wheeler acknowledges:
“Since each individual value is used to create two moving ranges, the computations can create correlations within the moving range values.”
He even included the following example to dramatize this point:
Wheeler points out that this prevents the use of runs in analyzing the chart. A chart is supposed to let the reader see patterns but the presence of autocorrelation here makes the most visible patterns an artefact of the technique instead of real information.
Two Strategies to Deal With Autocorrelation
If you don’t want to ignore the autocorrelation of moving ranges, you can:
- Replace them with a statistic that doesn’t have autocorrelation.
- Use a model that takes it into consideration instead of the plain upper control limit of the mR chart.
Eliminating autocorrelation by skipping every other point
So why choose a statistic that creates artificial complexity? One way to get rid of autocorrelation is to only include every other point in the range chart. If you have a series of measurements X_1,...,X_n,..., two consecutive values of the moving range, R_i = \left | X_i - X_{i-1} \right | and R_{i+1} = \left | X_{i+1} - X_i \right |, are correlated because they have the term X_i in common.
If you skip every other point, two consecutive points in the range chart will be R_i = \left | X_i - X_{i-1} \right | and R_{i+2} = \left | X_{i+2} - X_{i+1} \right | that have no common term. They are functions of different independent variables and therefore independent. If you see a run on this chart, it is as meaningful as on the X chart. The downside is that you are accumulating moving range points at half the speed of the X chart.
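The effect of skipping every other point can be checked numerically. This is a sketch under the same assumption of independent Gaussian measurements; the seed, the sample size, and the helper name correlation are illustrative.

```python
import random


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(7)  # arbitrary seed
measurements = [rng.gauss(0.0, 1.0) for _ in range(40_001)]

# Every moving range: consecutive ranges share one measurement.
all_ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]

# Every other moving range: |X1 - X0|, |X3 - X2|, ... share no measurement.
every_other = all_ranges[::2]

corr_consecutive = correlation(all_ranges[:-1], all_ranges[1:])
corr_every_other = correlation(every_other[:-1], every_other[1:])
print(f"consecutive moving ranges: {corr_consecutive:+.3f}")
print(f"every other moving range:  {corr_every_other:+.3f}")
print(f"points kept: {len(every_other)} of {len(all_ranges)}")
```

The last line shows the price: the thinned chart has half as many points as the full moving range chart.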
Using an autoregressive model
If you want to plot every moving range, you need to use a different and slightly more complicated model, called AR(1), of the form:
R_{i+1} - \mu_R = \alpha\left ( R_i - \mu_R \right ) + W'_{i+1}
where \alpha and \mu_R are coefficients estimated on a training set of data and the W'_i are independent, identically distributed noises. On future data, we can predict R_{i+1} from R_{i} and issue alarms when the W'_i exceed a threshold.
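One way to fit such a model and turn it into an alarm rule is sketched below. The estimation method is an assumption: it uses the sample mean for \mu_R, a least-squares slope for \alpha, and the empirical 99% quantile of the training residuals as the threshold. The function names fit_ar1, residual, empirical_quantile and is_alarm are hypothetical, and the training data are simulated.

```python
import random


def fit_ar1(ranges):
    """Estimate mu_R (sample mean) and alpha (least-squares slope of
    R[i+1] - mu_R on R[i] - mu_R) from a training series of moving ranges."""
    mu_r = sum(ranges) / len(ranges)
    centered = [r - mu_r for r in ranges]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    denominator = sum(a * a for a in centered[:-1])
    return mu_r, numerator / denominator


def residual(prev_r, next_r, mu_r, alpha):
    """Noise term: observed next range minus the AR(1) prediction,
    where the prediction is mu_R + alpha * (previous range - mu_R)."""
    return next_r - (mu_r + alpha * (prev_r - mu_r))


def empirical_quantile(values, level):
    """Simple order-statistic quantile, with no interpolation."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(level * len(ordered)))]


rng = random.Random(1)  # arbitrary seed
train_x = [rng.gauss(0.0, 1.0) for _ in range(50_000)]  # simulated training data
train_r = [abs(b - a) for a, b in zip(train_x, train_x[1:])]

mu_r, alpha = fit_ar1(train_r)
train_residuals = [residual(a, b, mu_r, alpha) for a, b in zip(train_r, train_r[1:])]
threshold = empirical_quantile(train_residuals, 0.99)  # assumed 99% level


def is_alarm(prev_r, next_r):
    """Flag a new moving range whose residual exceeds the trained threshold."""
    return residual(prev_r, next_r, mu_r, alpha) > threshold


print(f"mu_R = {mu_r:.3f}, alpha = {alpha:.3f}, threshold = {threshold:.3f}")
```

Because the limit on R_{i+1} depends on R_i, it moves from point to point instead of forming a flat line.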
Visualizing the AR(1) model
It is visually less appealing than the mR chart, because you don’t have an Upper Control Limit in the form of a flat, straight line. This is seen in the following figure, generated from a simulation with both limits corresponding to a 99% level, or p = 1%:
The real issue is whether it performs any better in terms of avoiding false alarms and effectively flagging real ones. To answer this question, we need to get quantitative.
Quantification of Moving Range Autocorrelation
If you have a series of measurements X_1,...,X_n,..., two consecutive values of the moving range, R_i = \left | X_i - X_{i-1} \right | and R_{i+1} = \left | X_{i+1} - X_i \right |, have the term X_i in common.
Visualization
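If, instead of the absolute values R_i, you take the differences D_i = X_i - X_{i-1} = W_i - W_{i-1}, where W_i = X_i - \mu, you can calculate the correlation between D_i and D_{i+1}.
Expanding the product gives:
D_i\times D_{i+1} = \left ( W_i - W_{i-1} \right )\left ( W_{i+1} - W_i \right ) = W_iW_{i+1} - W_i^2 - W_{i-1}W_{i+1} + W_{i-1}W_i
The cross terms vanish in the expected value of D_i\times D_{i+1} because the W_i are independent and centered, and therefore:
E(D_i\times D_{i+1}) = -E(W_i^2) = -\sigma^2
Since D_i is the difference of two independent, centered Gaussians with standard deviation \sigma,
Var(D_i) = Var(D_{i+1}) = 2\sigma^2
And the correlation between D_i and D_{i+1} is therefore \frac{-\sigma^2}{2\sigma^2} = -1/2.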
The ranges R_i are the absolute values of the D_i and the math is not so easy. You can, however, estimate the correlation between R_i and R_{i+1} at about 0.224 from 100,000 simulated values. As seen in the following figures, it is too low to stand out in a scatterplot, yet has a p-value below 2.2\times10^{-16}.
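Both correlations can be reproduced with a short simulation. The sketch below also compares the estimate with a closed form for the correlation between the absolute values of two centered Gaussians with correlation \rho. This formula is not part of the original analysis and is included only as a cross-check. The seed and the helper name correlation are arbitrary.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2019)  # arbitrary seed
noise = [rng.gauss(0.0, 1.0) for _ in range(100_001)]

differences = [b - a for a, b in zip(noise, noise[1:])]  # D_i
ranges = [abs(d) for d in differences]                   # R_i = |D_i|

corr_d = correlation(differences[:-1], differences[1:])
corr_r = correlation(ranges[:-1], ranges[1:])

# Cross-check (assumption: D_i and D_{i+1} are jointly Gaussian with rho = -1/2).
# For standardized X, Y: E|XY| = (2/pi) * (sqrt(1 - rho^2) + rho * asin(rho)),
# E|X| = sqrt(2/pi), and Var|X| = 1 - 2/pi.
rho = -0.5
expected_abs_product = (2 / math.pi) * (math.sqrt(1 - rho**2) + rho * math.asin(rho))
closed_form = (expected_abs_product - 2 / math.pi) / (1 - 2 / math.pi)

print(f"simulated corr(D_i, D_i+1): {corr_d:+.3f}")
print(f"simulated corr(R_i, R_i+1): {corr_r:+.3f}")
print(f"closed-form corr of ranges: {closed_form:+.4f}")
```

The closed form evaluates to about 0.224, in line with the simulated estimate.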
The scatterplot of moving ranges is obtained by folding all the quadrants by symmetry around the axes onto the quadrant with D_i \geq 0 and D_{i+1} \geq 0 . This blurs the picture so much that the autocorrelation is no longer visually obvious.
The plotting method manages to show 100,000 points on a few square inches without producing a large blob of overlapping points. The trick is to treat it as a heat map. The area is divided into small hexagons that are colored in various shades from blue to red based on the number of points they contain.
Why 100,000 points when, in his article, Wheeler uses 150 camshaft bearing diameters? More generally, classical SPC bases process capability studies on sets of a few dozen points, essentially because the information technology available when it was developed did not allow you to work with larger data sets.
Today, final test at the end of a car engine assembly line produces a vector of characteristics every 30 seconds, or about 1,600 times per day. To get 100,000 actual data points, you query the history database for the past 63 work days. Simulating 100,000 points, today, is instantaneous.
Comparative performance
The point of the moving range chart is to detect shifts in the size of the noise, which is determined by its standard deviation \sigma .
If we use our simulated 100,000 points with \sigma =1 as a training set, the ideal control system would never issue an alarm as long as \sigma =1 and would immediately react when it shifts to \sigma > 1.
No real method can actually do this. They all generate some false alarms and do not always notice a shift immediately but, using simulated testing sets, we can measure how many points it takes before a system issues a false or a real alarm.
Using 10 simulated testing sets, and values of \sigma from 1 to 4, we count how many points we go through before issuing an alarm, and take the average over all the simulations. The best performance is a high number for false alarms at \sigma =1 and a low number for real alarms at \sigma > 1. The following figure compares, on those terms, the following three methods:
- The upper control limit (UCL) for the mR chart.
- The AR(1) 99% limit for p = 1%, the same level as the UCL.
- The AR(1) 99.7% limit for p = 0.3%, the same level as the control limits of the X chart.
The chart shows that the autoregressive model produces fewer false alarms and is as effective as the UCL for \sigma\geq 1.6. For smaller shifts in \sigma, the UCL responds faster. Again, the relative importance of avoiding false alarms versus rapidly detecting small shifts in \sigma depends on the maturity of the process.
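The run-length measurement itself is simple to sketch. The example below covers only the mR chart UCL, not the AR(1) limits, and is not the simulation behind the figure: the constant 3.267 is the conventional multiplier of the average moving range for the mR chart, and the seed, the training size, the number of runs (200 here), and the function names are assumptions made for illustration.

```python
import random

D4 = 3.267  # conventional mR chart constant: UCL = D4 * average moving range


def points_until_alarm(rng, sigma, ucl, max_points=100_000):
    """Count new points observed until a moving range exceeds the UCL."""
    previous = rng.gauss(0.0, sigma)
    for count in range(1, max_points + 1):
        current = rng.gauss(0.0, sigma)
        if abs(current - previous) > ucl:
            return count
        previous = current
    return max_points  # no alarm within the cap


rng = random.Random(3)  # arbitrary seed

# Training set with sigma = 1 to set the limit.
training = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
training_ranges = [abs(b - a) for a, b in zip(training, training[1:])]
ucl = D4 * sum(training_ranges) / len(training_ranges)


def average_run_length(sigma, runs=200):
    """Average number of points before an alarm, over simulated test sets."""
    return sum(points_until_alarm(rng, sigma, ucl) for _ in range(runs)) / runs


arl = {sigma: average_run_length(sigma) for sigma in (1.0, 1.5, 2.0, 3.0, 4.0)}
for sigma, length in arl.items():
    print(f"sigma = {sigma:.1f}: average points before alarm = {length:.1f}")
```

At \sigma = 1, every alarm is false, so a long average run is good. At larger values of \sigma, a short one is.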
Conclusions
Manufacturing, engineering, and business in general produce all sorts of time series and all business professionals have to analyze them in some fashion.
Plot the time series
As pointed out by authors like Don Wheeler or Mark Graban, the worst possible way to use this data is to look only at the last value. Because it’s time-dependent data, you need to consider its history by plotting values against time, which is easier to do if it is just a number, as opposed to a multidimensional vector.
Consider the meaning of the data
Before applying any tool to a time series, you need to ponder what the plot is visually suggesting in light of the nature and origin of the data. As discussed in my comments on Mark Graban’s article on Oscar TV Viewership, it makes a difference whether the numbers represent the performance of a baseball player, a rep’s sales volume, or the torque produced by an engine.
Use a panoply of analytical tools
Then you can apply a variety of tools to confirm or refute conjectures the plot suggests, or to identify patterns that are plausible given the nature of the data but do not stand out in the plots. Looking at an mR chart, for example, you would not guess that consecutive values are correlated, and it’s not visible in the scatter plot of pairs of consecutive values. You assume they are correlated based on the nature of the moving range, and calculations confirm it.
The XmR chart is not a panacea
In another paper, Wheeler identifies W.J. Jennett as the creator of the XmR chart in the UK in 1942. Wheeler then explains that he brought it back from obscurity in the 1980s and has since been promoting it as a universal tool for analyzing time series.
Wheeler’s view is far from the mainstream in time series analysis. The XmR chart is nowhere to be found in the technical literature on this topic and gets at most brief mentions in the literature on statistical quality. According to Wheeler, even Deming didn’t know about it in 1985.
Swiss Army knives are a last resort
In this article, Wheeler also describes the XmR chart as a “Swiss Army knife.” The Swiss Army knife, however, is a tool of last resort. It has many functions but doesn’t perform any of them as well as a special purpose tool. You can use it to open a can of beans but you don’t unless you have to, because a specialized can opener works better.
As discussed in an earlier post, the only multifunction tool that outperforms special-purpose rivals is the computer, and it has changed the game in data science. Like all the tools of SPC, the XmR chart predates the invention of the computer.
References
Shumway, R.H. & Stoffer, D.S. (2017) Time Series Analysis and Its Applications, Springer, ISBN: 978-3319524511
Montgomery, D.C., Jennings, C.L., Kulahci, M. (2016) Introduction to Time Series Analysis and Forecasting, Wiley, ISBN: 978-1118745113
Montgomery, D.C. (2012) Introduction to Statistical Quality Control (7th Edition), Wiley, ISBN: 978-1118146811
Pyzdek, T. & Keller, P. (2013) The Handbook for Quality Management, McGraw-Hill, ISBN: 978-0-07-179924-9
Dr Tony Burns
March 13, 2019 @ 8:52 am
Comment on LinkedIn:
Michel Baudin
March 13, 2019 @ 8:52 am
I am interested in the theories behind tools and the theories matter to their usefulness. As for computers, they are everywhere. The people and organizations who are best at using them win, in politics, business, sports,… What’s so great about using techniques that predate them? “No computer” means that you are helpless with a dataset of 100,000 points in 45 dimensions, which is what you routinely get today.
Dr Tony Burns
March 13, 2019 @ 8:52 am
Comment on LinkedIn:
Michel Baudin
March 13, 2019 @ 8:52 am
As Wheeler pointed out, you can’t trust runs on a moving range chart because of autocorrelation. It’s a flaw of the method. Try and explain that to an operator!
The problem is establishing and maintaining process capability, by whatever means work. If you want to keep pushing World War II vintage manual methods for these purposes, it’s your choice.
Dr Tony Burns
March 13, 2019 @ 8:52 am
Comment on LinkedIn:
Michel Baudin
March 13, 2019 @ 8:52 am
I am quoting an article by him that is specifically on autocorrelation of moving ranges, which, again, isn’t there because of any autocorrelation in the original data but because two successive moving ranges R_i and R_{i+1} are both functions of X_i.
It’s an artifact of the method, and it’s Wheeler who points out that it screws up the interpretation of runs. You can just follow the link in my post to see his article. The article you point to is about autocorrelation in the raw data. What I am discussing is the autocorrelation between moving ranges, which is an artifact of the method.
In the article you reference, Wheeler’s conclusion is that you should use XmR charts regardless of autocorrelation. Mine is to use the model that works best. In the presence of autocorrelation, it may well be an autocorrelated time series model.
Autocorrelated models weren’t worked out until 20 to 30 years after the XmR chart but, today, represent a major chunk of the literature on time series. The keywords are AR, ARMA, or ARIMA.