Safety Stocks: More about the formula

In a previous post on 2/12/2012, I warned against the blind use of formulas in setting safety stock levels. Since then, it has been the single most popular post in this blog, and commands as many page views today as when it first came out. Among the many comments, I noticed that several readers, when looking at the formula, were disturbed that three of the four parameters under the radical are squared and the other one isn’t, to the point that they assume it to be a mistake. I have even seen an attempt on Wikipedia to “correct that mistake.”

I was myself puzzled by it when I first saw the formula, but it’s no mistake. The problem is that most references, including Wikipedia, just provide the formula without any proof or even explanation. The authors just assume that the eyes of inventory managers would glaze over at the hint of any math. If you are willing to take my word that it is mathematically valid, you can skip the math. You don’t have to take my word for it, but then, to settle the discussion, there is no alternative to digging into the math.

A side effect of working out the math behind a formula is that it makes you think harder about the assumptions behind it, and therefore about its range of applicability, which we discuss after the proof. If you don’t need the proof, please skip to the Applicability section.

Math prerequisites

As math goes, it is not complicated. It only requires a basic understanding of expected value, variance, and standard deviation, as taught in an introductory course on probability.

In this context, those who have forgotten these concepts can think of them as follows:

  • The expected value E(X) of a random variable X can be viewed, in the broadest sense, as the average of the values it can take, weighted by the probability of each value. It is linear, meaning that, for any two random variables X and Y that have expected values,

E[X+Y] = E[X]+E[Y]

and, for any number a,

E[a\times X]= a \times E[X]

  • Its variance is the expected value of the square of the deviation of individual values of X from its expected value E(X):

Var(X) = E\left[(X-E(X))^{2}\right] = E[X^{2}] -[E(X)]^{2}

Variances are additive, but only for uncorrelated variables X and Y that have variances. If

E[[X-E(X)] \times [Y-E(Y)]]= 0

then

Var(X+Y) = Var(X)+Var(Y)

  • Its standard deviation is

\sigma = \sqrt{Var[X]}
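For readers who prefer to see these properties at work on numbers, here is a minimal sketch using exact fractions. The fair die is my own illustrative choice, and the helper names `expected_value` and `variance` are hypothetical, not taken from any library.

```python
from fractions import Fraction
from itertools import product

def expected_value(dist):
    """E[X] for a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())

def variance(dist):
    """Var(X) computed as E[X^2] - (E[X])^2."""
    mean = expected_value(dist)
    mean_of_squares = sum(value**2 * prob for value, prob in dist.items())
    return mean_of_squares - mean**2

# A fair six-sided die (illustrative assumption).
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution of the sum of two independent, hence uncorrelated, dice.
two_dice = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    two_dice[x + y] = two_dice.get(x + y, 0) + px * py

print(expected_value(die))   # 7/2
print(variance(die))         # 35/12
print(variance(two_dice))    # 35/6, which is Var(X) + Var(Y)
```

The variance of the sum is exactly twice the variance of one die, as additivity for uncorrelated variables requires.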

Proof of the Safety Stock Formula

Fasten your seat belts. Here we go:

As stated in the previous post, the formula is:

S = C\times \sqrt{\mu_{L}\times\sigma_{D}^{2}+\mu_{D}^{2} \times \sigma_{L}^{2}}

Where:

  • S is the safety stock you need.
  • C is a coefficient set to guarantee that the probability of a stockout is small enough.
  • The other factor, the square root, is the standard deviation of the quantity consumed between deliveries.
  • μL and σL are the mean and standard deviation of the time between deliveries.
  • μD and σD are the mean and standard deviation of the demand per unit time.

\sqrt{\mu_{L}\times\sigma_{D}^{2}+\mu_{D}^{2} \times \sigma_{L}^{2}} is the standard deviation of the item quantity consumed between deliveries, considering that the time between deliveries varies.
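As a quick illustration, the formula can be evaluated in a few lines of Python. This is only a sketch: the function names and the numbers (deliveries every 4 days on average with a standard deviation of 1 day, demand of 100 units per day with σD = 20, and C = 1.645) are hypothetical values chosen for the example.

```python
import math

def demand_std_between_deliveries(mu_l, sigma_l, mu_d, sigma_d):
    """Standard deviation of the quantity consumed between deliveries."""
    return math.sqrt(mu_l * sigma_d**2 + mu_d**2 * sigma_l**2)

def safety_stock(c, mu_l, sigma_l, mu_d, sigma_d):
    """S = C times the standard deviation of demand between deliveries."""
    return c * demand_std_between_deliveries(mu_l, sigma_l, mu_d, sigma_d)

# Hypothetical inputs: time in days, demand in units per day.
sigma = demand_std_between_deliveries(mu_l=4, sigma_l=1, mu_d=100, sigma_d=20)
print(round(sigma, 1))                                   # 107.7
print(round(safety_stock(1.645, 4, 1, 100, 20), 1))      # 177.2
```

Note that μL enters unsquared while the three other parameters are squared, exactly as in the formula.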

μD and \sigma_{D}^{2} are the mean and variance of the demand per unit time, so that the demand for a period of length T has a mean of \mu_{D} \times T, a variance of \sigma_{D}^{2} \times T, and therefore a standard deviation of \sigma_{D} \times \sqrt{T}. The implications of this assumption are discussed in the Applicability section below.

Note that the assumptions are only that these means and variances exist. At this stage, we don’t have to assume more, and particularly not that times between deliveries and demand follow a particular distribution.

If D(T) is the demand during an interval of duration T, since:

E \left [ D\left ( T\right ) \right ] = \mu_{D}\times T

Var\left [D\left ( T\right ) \right ]= \sigma_{D}^{2}\times T

we have:

E\left [ D\left ( T \right )^{2} \right ]= Var\left [ D\left ( T\right ) \right ]+ \left ( E\left [ D\left ( T \right ) \right ]\right )^{2}= \sigma_{D}^{2}\times T + \mu _{D}^{2} \times T^{2}

If we now allow T to vary around mean μL with standard deviation σL, then, taking expected values over T (with T assumed not to depend on the demand), we have:

E \left [ D \right ] = \mu_{D}\times E\left [ T \right ] = \mu_{D}\times\mu_{L}

E\left [ D^{2} \right ]= E\left [ E\left [ D\left ( T \right )^{2} \right ] \right ] = \sigma_{D}^{2}\times E\left [ T \right ] + \mu _{D}^{2} \times E\left [ T^{2} \right ]

and therefore:

E\left [ D^{2} \right ]= \sigma_{D}^{2}\times \mu _{L} + \mu _{D}^{2} \times \left ( \sigma_{L}^{2}+\mu _{L}^{2} \right )

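and, since Var[D] = E[D^{2}] - (E[D])^{2} with (E[D])^{2} = \mu_{D}^{2} \times \mu_{L}^{2}, the \mu_{D}^{2} \times \mu_{L}^{2} terms cancel, leaving: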

Var\left [ D \right ]= \mu _{L}\times\sigma_{D}^{2} + \mu _{D}^{2} \times \sigma_{L}^{2}

That’s how the variance ends up linear in one parameter and quadratic in the other three!

Then:

 \sigma\left [ D \right ]= \sqrt{\mu _{L}\times\sigma_{D}^{2} + \mu _{D}^{2} \times \sigma_{L}^{2} }

QED.
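If you would rather check the result numerically than follow the algebra, a small simulation can do it. The sketch below rests on assumptions of my own that are not part of the proof: the time between deliveries is drawn uniformly between 2 and 6, and, given T, the demand is drawn from a normal distribution with mean μD × T and variance σD² × T. Any other distributions with the same means and variances should serve equally well.

```python
import random
import statistics

random.seed(42)

MU_D, SIGMA_D = 100.0, 20.0   # demand per unit time (hypothetical values)
T_MIN, T_MAX = 2.0, 6.0       # time between deliveries ~ Uniform(T_MIN, T_MAX)

# Mean and variance of a uniform distribution on [T_MIN, T_MAX].
mu_l = (T_MIN + T_MAX) / 2
var_l = (T_MAX - T_MIN) ** 2 / 12

def demand_between_deliveries():
    """Draw T, then draw the demand over T with mean MU_D*T, variance SIGMA_D^2*T."""
    t = random.uniform(T_MIN, T_MAX)
    return random.gauss(MU_D * t, SIGMA_D * t ** 0.5)

samples = [demand_between_deliveries() for _ in range(100_000)]
simulated_var = statistics.pvariance(samples)
formula_var = mu_l * SIGMA_D**2 + MU_D**2 * var_l

print(f"simulated: {simulated_var:.0f}, formula: {formula_var:.0f}")
```

With this many samples, the two numbers should agree to within a few percent. The total demand here is a mixture and not itself normally distributed, which illustrates that the variance formula does not rely on normality.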

Note that the above argument only requires the means and standard deviations to exist. There is no assumption at this point that the demand or the lead time follows a normal distribution. However, the calculation of the multiplier C, used to set an upper bound for the demand in a period, is based on the assumption that the demand between deliveries is normally distributed.
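Under that normality assumption, C is a quantile of the standard normal distribution. The sketch below uses Python's standard `statistics.NormalDist`. The function name is hypothetical, and reading "small enough" as a target probability of stockout between two deliveries is my interpretation for the purpose of the example.

```python
from statistics import NormalDist

def multiplier_for_stockout_probability(p_stockout):
    """C such that a standard normal variable exceeds C with probability p_stockout."""
    return NormalDist().inv_cdf(1.0 - p_stockout)

for p in (0.10, 0.05, 0.01):
    print(p, round(multiplier_for_stockout_probability(p), 3))
# 0.1 1.282
# 0.05 1.645
# 0.01 2.326
```

If the demand between deliveries is far from normal, these multipliers no longer correspond to the stated stockout probabilities.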

Applicability

The assumption that the variance of demand in a period of length T is \sigma_{D}^{2} \times T implies that it is additive, because, if  T = T_{1} + T_{2}, then

\sigma_{D}^{2} \times T = \sigma_{D}^{2} \times T_{1} + \sigma_{D}^{2} \times T_{2}.

But this is only true if the demands in periods T_{1} and T_{2} are uncorrelated. For a hot dog stand working during lunchtime, this is reasonable: the demands in the intervals between 12:20 and 12:30, and between 12:30 and 12:40 are from different passersby, who make their lunch choices independently.

On the other hand, in a factory, if you make a product in white on day shift and in black on swing shift every day,  then the shift demand for white parts will not meet the assumptions. Within a day, it won’t be proportional to the length of the interval you are considering, and the variances won’t add up. Between days, the assumptions may apply.
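The white-and-black example can be mimicked with a toy simulation. Everything in it is an assumption made for illustration: the daily demand for white parts is drawn independently each day from a normal distribution with mean 500 and standard deviation 40, and all of it is consumed on day shift.

```python
import random
import statistics

random.seed(1)
DAYS = 20_000

# Hypothetical daily demand for white parts, all consumed on day shift,
# drawn independently from one day to the next.
day_shift = [random.gauss(500.0, 40.0) for _ in range(DAYS)]
swing_shift = [0.0] * DAYS  # no white parts are used on swing shift

var_day_shift = statistics.pvariance(day_shift)
var_swing_shift = statistics.pvariance(swing_shift)
var_one_day = statistics.pvariance([d + s for d, s in zip(day_shift, swing_shift)])

# Totals over non-overlapping pairs of consecutive days.
two_day_totals = [day_shift[i] + day_shift[i + 1] for i in range(0, DAYS, 2)]
var_two_days = statistics.pvariance(two_day_totals)

print(f"day shift: {var_day_shift:.0f}, swing shift: {var_swing_shift:.0f}")
print(f"one day: {var_one_day:.0f}, two days: {var_two_days:.0f}")
```

Within a day, going from one shift to two does not double the variance, and the two shifts do not even have the same variance. Between days, going from one day to two roughly doubles it, which is what the model requires.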

More generally, the time periods you are considering must be long with respect to the detailed scheduling decisions you make. If you cycle through your products in a repeating sequence, you have an “Every-Part-Every” interval (EPEI), meaning, for example, that, if your EPEI is 1 week, you have one production run of every product every week.

In a warehouse, product-specific items don’t need replenishment lead times below the EPEI. If you are using an item once a week, you don’t need it delivered twice a day. You may instead receive it once a week, every other week, every three weeks, etc. And the weekly consumption will fluctuate with the size of the production run and with quality losses. Therefore, it is reasonable to assume that its variance will be \sigma_{D}^{2} \times T where T is a multiple of the EPEI, and it can be confirmed through historical data.
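One possible way to run that check on historical data is to total the consumption over windows of 1, 2, 3, ... EPEIs and see whether the variance divided by the window length stays roughly constant. The sketch below is only one such approach; the function name and the weekly figures are made up for illustration, and a real check would need a much longer history.

```python
import statistics

def variance_per_period(consumption, window):
    """Variance of totals over non-overlapping windows, divided by the window length."""
    usable = len(consumption) - len(consumption) % window
    totals = [sum(consumption[i:i + window]) for i in range(0, usable, window)]
    return statistics.pvariance(totals) / window

# Hypothetical weekly consumption history, with an EPEI of one week.
weekly = [120, 95, 130, 110, 105, 125, 90, 115, 135, 100, 110, 120]

for window in (1, 2, 3):
    print(window, round(variance_per_period(weekly, window), 1))
```

If the ratio drifts steadily up or down as the window grows, consumption in successive periods is probably correlated, and the assumption that the variance is \sigma_{D}^{2} \times T should not be relied upon.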

You can have replenishment lead times that are less than the EPEI for materials used in multiple products. For example, you could have daily deliveries of a resin used to make hundreds of different injection-molded parts with an EPEI of one week. In this case, the model may be applied to shorter lead times, subject of course to validation from historical data.