Saturation In Manufacturing Versus Service

In Capacity Planning For 1st Responders, we considered the problem of dimensioning a group so that there is at least one member available when needed. Not all service groups, however, are expected to respond immediately to all customers. Most, from supermarket check stands and airport check-in counters to clinics for non-emergency health care, allow some amount of queueing, giving rise to the question of how long the queues become when the servers get busy.

Patients waiting in Emergency Room

At one point in his latest book, Andy and Me And The Hospital, Pascal Dennis writes that the average number of patients in an emergency room is inversely proportional to the availability of the doctors. The busier the doctors are, the more dramatic the effect. For example, if they go from being busy 98% of the time to 99%, their availability drops by half, from 2% to 1%, and the mean number of patients doubles. Conversely, any improvement in emergency room procedures that provides the same service while reducing the doctors’ utilization from 99% to 98% cuts the mean number of patients, and their mean waiting time, in half.

The rule cited by Pascal Dennis is clearly valuable in setting improvement priorities, where it applies. The following paragraphs explore what basic queueing theory tells us about this and practical ways to check it in a real system.

1. From Production Lines To Queueing Systems

No matter how much they bemoan “variability” in their work, manufacturing professionals operate in a controlled environment with less of it than almost any other human endeavor, and this environment does not prepare them for the level of variability found in services, where events are driven by the independent actions of multiple agents.

1.1. Saturation Of An Assembly Line

When work arrives at a work station at fixed intervals one piece at a time and each piece undergoes the same operation in the same, fixed amount of time, the station can be used 100% of the time without accumulating a queue in front. This is what happens inside an assembly line. If it works at a takt time of, say, 58 seconds and you are at station 15, every 58 seconds a workpiece arrives from station 14, you work on it for up to 58 seconds according to a fixed, standard method, and it goes on to station 16.

It is a deterministic system, even if it exists only as an approximation. Real assembly lines are subject to small disruptions and variations in the pace at which assemblers work, which make it impossible to specify exactly 58 seconds of work at every station but still, a moving assembly line physically prevents the accumulation of WIP between stations and does deliver a new workpiece at fixed intervals.

In order fulfillment, queueing will still happen with orders upstream from production and with finished goods downstream, but most of it is engineered out of production. In Manufacturing, there is also queueing in peripheral activities like the arrival and departure of employees at the beginning and end of each shift, food services, and restrooms. Many factories, for example, require employees to stand in line to punch in and out of the facility, and to spend half their lunch breaks walking to a central cafeteria and standing in line.

1.2. Why Do Customers Appear To Arrive In Clusters?

A station where an individual performs services of varying durations for customers arriving randomly differs from a station on an assembly line in many ways, but particularly in that the waiting line of customers grows explosively as the station’s utilization approaches 100%. In this context, even the notions of workload and capacity must be envisioned differently than in an assembly line. Instead of workpieces fed to the first station at fixed intervals, customers arrive randomly; instead of fixed process times, you have random service times.

Not my sister’s bookstore, which no longer exists.

As a teenager, I occasionally spent Saturday afternoons in my sister’s bookstore in Paris, and observed the flow of incoming customers. They seemed to come in clusters. The store abruptly filled up, and gradually emptied out. Then there was a dry spell, followed by another crowd. I found this strange, because the customers were entering from a sidewalk where I saw a steady stream of pedestrians, and they came in as individuals who didn’t know each other. I couldn’t see any clustering mechanism at work, and did not understand what was going on until several years later, when I studied queueing theory.

The most basic model, applicable to bookstore customers, is more complicated than the assembly line case in that it requires some High School math. It assumes independent arrivals at a steady average rate. You may, for example, observe 100 customers entering the store in 100 minutes, giving you an arrival rate of 1 customer/minute, or a mean time between customers of 1 minute. Then you assume that the numbers of customers entering in any disjoint time intervals are independent. The math then shows you why your intuition is wrong when it tells you to expect a steady flow of incoming customers. Let’s examine dry spells, that is, intervals during which no customer arrives. What is the probability α(t) of a dry spell of duration t?

For the waiting time for the next customer to be t, we must have a dry spell of length t but not of length t+dt, which means that we have a dry spell of length t followed by an arrival between t and t+dt. Arrivals in [0,t] and [t, t+dt] are independent, the arrival rate is a constant λ, and we need 0 arrivals in [0,t] followed by 1 in [t, t+dt]. Therefore:

\alpha(t) - \alpha(t+dt) = \alpha(t)\times \lambda dt

\alpha'(t) = -\lambda \alpha(t)

and:

\alpha(t) =e^{-\lambda t}

It is the probability that no customer arrives for a duration t; in other words, the interarrival times follow the exponential distribution. λ is the arrival rate, and the probability density function for interarrival times is:

f(t) = \lambda e^{-\lambda t}

When you plot this, you understand why customers appear to arrive in clusters in the absence of any clustering mechanism. Short times between customers occur more often than long ones, and, to the observers, make the customers appear to arrive as a group. The infrequent long intervals, on the other hand, separate these groups and stand out on the timeline, just for being long.

In the literature, this model of customer arrivals is called a Poisson Process with rate λ, and is usually introduced by pulling the math out of a hat, without explaining the assumptions of a constant rate and independent arrivals. The following picture is a simulation of Poisson arrivals at the rate of 1 per minute for two hours:

Simulation of Poisson arrivals at rate of 1 per minute for 2 hours
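
For readers who want to generate this kind of picture themselves, here is a minimal sketch in Python, using NumPy and Matplotlib, with the same rate and duration; the random seed and plotting choices are arbitrary:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)   # fixed seed, so the example is reproducible
rate = 1.0                       # arrivals per minute
horizon = 120.0                  # two hours, in minutes

# Interarrival times of a Poisson process are independent, exponential with mean 1/rate
gaps = rng.exponential(scale=1.0 / rate, size=300)
arrival_times = np.cumsum(gaps)
arrival_times = arrival_times[arrival_times <= horizon]

# Mark each arrival as a tick on the timeline to make the apparent clusters visible
plt.eventplot(arrival_times, orientation='horizontal', colors='black')
plt.xlabel('Minutes')
plt.yticks([])
plt.title('Simulated Poisson arrivals at 1/minute for 2 hours')
plt.show()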

The shopkeeper who sees the flow of customers dry up during the busiest shopping day of the week doesn’t need to fret. It’s the Poisson process at work, and the next gaggle of customers is around the corner.

The assumptions matter because they explain why the Poisson Process fits, for example, the pooled output of many independent series of events, like the failures occurring on 100 different machines and queued for maintenance attention, even when the Poisson process is a poor fit for the failures of any individual machine. In reliability theory, the arrival rate of failures on one machine is called its hazard function, and it is not always a constant.

Likewise, in the absence of a specific disaster or epidemic, each of the, say, 50,000 people served by a hospital has his or her own health issues that may occasionally require a visit to the emergency room. In aggregate, you can expect this population to produce patients arriving independently at a steady average rate, and you can model the patient arrivals as a Poisson Process. Of course, disasters and epidemics do occur, during which this logic no longer applies.

The Poisson Process is not always a match for the more complex queueing behavior of human beings. Instead of joining a long queue, they may balk and walk away. In societies with chronic shortages, as the Soviet Union was, they may instead join a queue without knowing what it is for, in case it is for something valuable. In such situations, the arrival rate of new customers depends on the population already there, and therefore the independence assumption no longer applies.

Some service operations have independent arrivals at rates that vary over time. If you serve food, for example, you can expect peaks around meal times. This can be accommodated in the Poisson Process by using a variable arrival rate λ(t) instead of a constant λ. There are also refinements to model actual clustering, which is what I worked on early in my career, as a mathematician, before I got initiated to factories. The math, however, quickly becomes unable to produce simple formulas, and the most practical way to analyze such processes is by simulating them.
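
To give an idea of what such a simulation involves, here is a minimal sketch of the standard thinning method for generating arrivals with a variable rate λ(t); the rate function, peaking around a meal time, is made up for illustration:

import numpy as np

rng = np.random.default_rng(1)

def meal_time_rate(t):
    """Hypothetical arrival rate in customers/minute, peaking at t = 240 minutes."""
    return 0.5 + 2.0 * np.exp(-((t - 240.0) / 60.0) ** 2)

def thinned_arrivals(rate_fn, rate_max, horizon):
    """Simulate arrivals with a variable rate by thinning a constant-rate Poisson process."""
    t, arrivals = 0.0, []
    while True:
        t += rng.exponential(1.0 / rate_max)       # candidate arrival from the fast process
        if t > horizon:
            return np.array(arrivals)
        if rng.uniform() < rate_fn(t) / rate_max:  # keep it with probability rate(t)/rate_max
            arrivals.append(t)

times = thinned_arrivals(meal_time_rate, rate_max=2.5, horizon=480.0)
print(f"{len(times)} arrivals in 8 hours")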

1.3. The Service Process

We have discussed how customers arrive but not how they receive service. There are two issues to consider:

  1. The order in which customers are served. First-In-First-Out (FIFO) is the discipline with the simplest math. FIFO is usually easy to implement with people because it is simple and they perceive it as fair. Even when customers are people, however, there are cases where it doesn’t make sense. In an emergency room, for example, patients who arrive in a life-threatening condition should be treated ahead of those who don’t. In manufacturing, FIFO preserves the process sequence, which is essential in diagnosing quality problems. Computer operating systems, on the other hand, use complex algorithms to allocate slices of processor time to competing processes.
  2. The time it takes to serve each customer. If the service is a flu shot, there is little variation in duration between customers; if it is checking out of a supermarket, it varies, roughly in proportion to the number of items in each shopper’s cart but it is also influenced by the mode of payment; if it is delivering a checked suitcase to an airport baggage claim carousel, it varies with the position of the suitcase in the delivery sequence,…

The simplest model, extensively covered in the literature, is FIFO with a single server and exponential service times. It means that, while a customer is being served, the probability that service completes in the next short interval dt is μdt, regardless of how long it has already been going on, with μ known as the service rate. Similarly to the arrival process, the probability density function of the service time is:

s(t) = \mu e^{-\mu t}

In real service systems, you often have multiple servers for a single queue. At US airports, for example, a single queue of passengers waits before multiple check-in counters. Supermarket check-out stands could be organized the same way but usually are not, because managers believe that it would relieve the pressure on check stand operators and that their productivity would drop.

As indicated above, you also have many cases where service times just are not exponential. Exponential service times are nonetheless of interest in an analysis of saturation because, as we shall see, the key results apply to more general cases.

2. Saturation In Service

Let us now consider the complete arrival-service-departure process, with a single server. In the literature, it is often called a birth-and-death process, each arrival being a metaphorical birth and each departure a death. This is an awkward terminology when discussing hospital patients. Even though it takes longer to say, arrival-and-departure process is more appropriate in this context.

2.1. Saturation With Poisson Arrivals And Exponential Service Times

For the time being, we’ll assume Poisson arrivals at rate λ and exponential service with completion rate μ. In the literature, this is called an M/M/1 queue; with c>1 servers, an M/M/c queue; with another service time distribution, an M/G/1 queue; with another arrival process and service time distribution, a G/G/1 queue, etc. In all these cases, the process reaches a steady state only when the arrival rate λ is less than the service rate μ.

Obviously, if customers arrive faster than they can be served, the number of customers grows indefinitely. What we’ll show with the M/M/1 queue is that, as λ approaches μ from below, the mean number of customers in the system grows like 1/(1-ρ), where ρ = λ/μ is the utilization of the system. Then we’ll list the results that show this to be true as well for M/M/c, M/G/1, and G/G/1 queues, with few restrictions on the “general” distributions.

In an M/M/1 queue, let Pi be the steady-state probability of having i customers in the system, either waiting or being served. P0 is then the probability that there is no customer, and therefore that the server is idle, so that 1-P0 is the server’s utilization, and we want to know what happens to the queue when this utilization approaches 100%. The Pi are calculated by balancing the in- and outflows of each state. You reach the empty state 0 from state 1 when its single customer completes service; you leave it when a new customer arrives. Therefore:

\mu \times P_{1} = \lambda \times P_{0}

and

P_{1} = \rho \times P_{0}

For i ≥ 1, balancing the flows in and out of state i gives

\mu \times P_{i+1} + \lambda \times P_{i-1} = \left ( \lambda + \mu \right )\times P_{i}

This implies

P_{i} = \rho P_{i-1} = ... = \rho^{i}P_{0}

and, since the probabilities must add up to 1,

P_{0} = 1-\rho

The mean number of customers in the M/M/1 queue is therefore:

L = \frac{\rho}{1-\rho}
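
As a sanity check on this formula, the following sketch simulates a single FIFO server with Poisson arrivals and exponential service times at 90% utilization (the parameter values are arbitrary) and compares the observed mean number of customers with ρ/(1-ρ):

import numpy as np

rng = np.random.default_rng(2)
lam, mu, n = 0.9, 1.0, 100_000            # arrival rate, service rate, number of customers
rho = lam / mu

arrivals = np.cumsum(rng.exponential(1.0 / lam, n))
services = rng.exponential(1.0 / mu, n)

# FIFO single server: each service starts when both the customer and the server are ready
departures = np.empty(n)
free_at = 0.0
for i in range(n):
    free_at = max(free_at, arrivals[i]) + services[i]
    departures[i] = free_at

mean_time_in_system = (departures - arrivals).mean()
L_simulated = lam * mean_time_in_system   # Little's Law: mean number = arrival rate x mean time
print(f"simulated L = {L_simulated:.1f}, formula rho/(1-rho) = {rho / (1 - rho):.1f}")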

2.2. Generalizations

The value of this result is that it applies to more general cases. The math to show it is more complex, and I am not showing the derivation of the formulas here, only their application when the queueing system saturates. The best option to go beyond the cases for which there are formulas is to use simulations.

1) Poisson Arrivals, Exponential Service Times, and Multiple Servers

At airports, to check in or go through border controls, you wait in a single line snaking between barriers, to be served at the first available of a row of counters. The logic is the same when you wait on hold for customer service, or when you take a numbered ticket for service by any available member of a group.

The basic mathematical model for this multiserver FIFO queue is called M/M/c, with c servers, Poisson arrivals, and exponential service times. It does not behave like the single server queue when lightly loaded but does when it is saturated. The utilization is ρ = λ/cμ and  the average number of customers in the system becomes the sum of two terms:

  1. The first one resembles the formula for the M/M/1 queue and accounts for the customers waiting for a server to become available.
  2. The second one is the average number of customers being served.

The formula is as follows:

L = \frac{\rho}{1-\rho} C\left ( c, \rho \right ) + c\rho

where C(c, ρ), given by the Erlang C formula, is the probability that all servers are busy. Clearly, as the utilization ρ → 1, so does C(c, ρ). C(c,ρ) is not an Excel built-in function, but can be calculated from the Poisson built-in function as follows:

=1/(1 + (1-rho)*POISSON(cee-1, cee*rho, TRUE)/POISSON(cee, cee*rho,FALSE))

where you put the number of servers cee and the utilization rho in correspondingly named cells. In R, there is a package called queueing that includes a function called C_erlang. The following chart plots L as a function of ρ as the number of servers varies:

Mean number of customers in system versus utilization by number of servers

It shows how, as utilization approaches 100%, all the curves merge.
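
For readers who would rather script this than use a spreadsheet, here is a sketch in Python that computes the same quantities as the Excel formula above; the function names are mine, and the utilization values are arbitrary:

import math

def erlang_c(c, rho):
    """Probability that all c servers are busy in an M/M/c queue with utilization rho."""
    a = c * rho                              # offered load
    top = a ** c / math.factorial(c)
    return top / ((1 - rho) * sum(a ** k / math.factorial(k) for k in range(c)) + top)

def mean_in_system(c, rho):
    """Mean number of customers in an M/M/c queue: waiting term plus customers in service."""
    return erlang_c(c, rho) * rho / (1 - rho) + c * rho

for c in (1, 2, 5, 10):
    print(c, [round(mean_in_system(c, rho), 1) for rho in (0.5, 0.8, 0.9, 0.95, 0.99)])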

As the utilization approaches 100%, the complex formula for the number of customers in the system reduces to:

L \approx c + A(c)\frac{\rho}{1-\rho} 

where A(c) is a function of c only. Asymptotically, the M/M/c queue saturates like the M/M/1 queue.

The math tells you that, to provide the same service, an M/M/c queue outperforms a set of c M/M/1 queues, each with an arrival rate of λ/c, and it makes you wonder why grocery supermarkets have a separate line for each check stand. The problem is that many of the assumptions behind the math are not satisfied in actual operations.

With multiple M/M/1 queues, you assume that customers are equally likely to join any, but the lines are in plain view and the customers can choose the shorter ones, and even switch lines midway. Real check-stand operators, also, don’t have the same service rates. Some work faster than others, and these differences are more visible when each stand has its own line than when they share a common one.
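
To put numbers on this comparison under the textbook assumptions, here is a small simulation sketch, with made-up parameters (4 servers at 90% utilization), of customers who each pick a line at random versus customers who wait in one shared line:

import numpy as np

rng = np.random.default_rng(3)

def time_in_system(arrivals, services):
    """Single FIFO server: time each customer spends waiting plus being served."""
    times = np.empty(len(arrivals))
    free_at = 0.0
    for i, (a, s) in enumerate(zip(arrivals, services)):
        free_at = max(free_at, a) + s
        times[i] = free_at - a
    return times

lam, mu, c, n = 3.6, 1.0, 4, 100_000       # total arrival rate, service rate, servers, customers
arrivals = np.cumsum(rng.exponential(1.0 / lam, n))
services = rng.exponential(1.0 / mu, n)

# c separate lines: each customer picks one at random and stays in it
line = rng.integers(c, size=n)
w_separate = np.concatenate(
    [time_in_system(arrivals[line == k], services[line == k]) for k in range(c)])

# One shared line: the customer at the head goes to the first server to free up
free_at = np.zeros(c)
w_shared = np.empty(n)
for i, (a, s) in enumerate(zip(arrivals, services)):
    k = np.argmin(free_at)
    free_at[k] = max(free_at[k], a) + s
    w_shared[i] = free_at[k] - a

print(f"mean time in system, separate lines: {w_separate.mean():.1f}")
print(f"mean time in system, one shared line: {w_shared.mean():.1f}")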

2) Poisson Arrivals, General Service Times, Single Server

For Poisson arrivals and any service time distribution that has a mean and a standard deviation, the M/G/1 queue, we have the Pollaczek-Khinchin formula:

L = \rho +\frac{\rho^{2}+ \lambda^{2}Var(S)}{2\times(1-\rho)}

where ρ = λ/μ, 1/μ is the mean of the service time S, and Var(S) its variance. Then, when ρ → 1,

L \approx \frac{\rho}{1- \rho}\times A(S)

where A(S) is only a function of the service time distribution, and the queue also saturates like the M/M/1 queue.
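
A short sketch of the Pollaczek-Khinchin formula, with arbitrary numbers, comparing exponential service times with constant service times of the same mean, shows how much of the queue comes from service-time variability:

def pollaczek_khinchin_L(lam, mean_s, var_s):
    """Mean number of customers in an M/G/1 queue with arrival rate lam and service time S."""
    rho = lam * mean_s
    return rho + (rho ** 2 + lam ** 2 * var_s) / (2 * (1 - rho))

lam, mean_s = 0.9, 1.0
print("exponential service times:", round(pollaczek_khinchin_L(lam, mean_s, mean_s ** 2), 2))
print("constant service times:   ", round(pollaczek_khinchin_L(lam, mean_s, 0.0), 2))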

3) General Arrival Process, General Service Time Distribution, Single Server

In 1961, John Kingman, in the UK, published a more general formula for the expected waiting time in a G/G/1 queue, where the interarrival times can follow a different distribution, of which we only require that it has a positive mean and a standard deviation. Then the mean waiting time Wq is approximately:

W_q \approx \left ( \frac{\rho}{1-\rho} \right )\times \left ( \frac{c_{a}^{2} + c_{s}^{2}}{2} \right )\times \frac{1}{\mu}

where 1/μ is the mean service time, and ca and cs are the coefficients of variation of interarrival times and of service times, defined as the ratios of standard deviation to mean. With Poisson arrivals at rate λ, the interarrival times are exponentially distributed, with a mean and standard deviation both equal to 1/λ. Therefore ca = 1. With a general distribution of interarrival times, ca can be anything.

To go from Wq to L, we multiply by λ to apply Little’s Law, which gives the mean number of customers waiting, and add ρ for the mean number of customers being served:

L = \rho + \lambda W_q \approx \rho\left (1 + \frac{\rho}{1- \rho}\times \frac{c_{a}^{2} + c_{s}^{2}}{2} \right )

If ca is bounded as the arrival rate λ varies in a neighborhood [μ-ε, μ], then we can bracket L as follows:

\left (\frac{\rho}{1- \rho}  \right )\times A(S) \leq L \leq \left (\frac{\rho}{1- \rho}  \right )\times B(S)

where A(S) and B(S) are functions only of the service times. Again, as ρ → 1, L rises like 1/(1-ρ).
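
In code, Kingman’s approximation takes a few lines; the sketch below, with arbitrary numbers, shows how the mean number of customers at 95% utilization responds to the variability of arrivals and of service:

def kingman_L(lam, mean_s, ca2, cs2):
    """Approximate mean number in a G/G/1 queue: Kingman waiting time, plus customers in service."""
    rho = lam * mean_s
    wq = (rho / (1 - rho)) * ((ca2 + cs2) / 2) * mean_s   # mean time spent waiting for the server
    return rho + lam * wq                                 # Little's Law for the queue, plus rho in service

lam, mean_s = 0.95, 1.0
print("low variability (ca2 = cs2 = 0.2):    ", round(kingman_L(lam, mean_s, 0.2, 0.2), 1))
print("Poisson/exponential (ca2 = cs2 = 1.0):", round(kingman_L(lam, mean_s, 1.0, 1.0), 1))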

4) More General Situations

The complexity of real queueing systems usually exceeds the best mathematicians’ ability to derive simple formulas, but the performance of such systems can be analyzed through simulations.

3. Conclusions

Pascal Dennis’s assumption that, in a hospital emergency room, as the doctors’ utilization ρ approaches 100%, the number of patients rises like 1/(1-ρ) is a reasonable starting point for steady-state operations, understanding that the dynamic is different in times of epidemics or natural disasters. And you may want to verify it through observations.

One consequence alluded to earlier is that the waiting lines in a saturated system can be massively cut by increasing service rates by just a few percentage points, with diminishing returns as the system becomes less saturated. Going from 99% to 98% in server utilization can cut your mean waiting line in half, but going further from 98% to 95% gets you less.

Once you are out of the saturation zone, where reducing utilization is a clear strategy to cut the mean waiting time, you can switch to other forms of improvement, in customer sequencing or in the physical organization of the waiting system. If your system’s customers are people, for example, you may give priority to some of them, and you may organize the waiting so that they don’t personally have to stand in line for hours outside, say by using numbered tickets.

Generally, waiting is regarded as a waste of time, particularly when it requires standing in line. It has been argued that, in the old Soviet Union, waiting was the main economic activity, because it was how citizens got things and their frustration with constant shortages was a factor in the collapse of the country.

In the US, oddly, some organizations have managed to make people volunteer or even pay to stand in line. For over 60 years, Disneyland has sold tickets for the privilege of standing in line for hours at rides that last minutes, and Apple product launches under Steve Jobs were occasions for fans to wait overnight at stores.

#AirportCheckIn, #Hospitals, #EmergencyRooms, #QueueingTheory, #Service, #Supermarkets

 

 

 

Variability, Randomness, And Uncertainty in Operations

This elaborates on the topics of randomness versus uncertainty that I briefly touched on in a prior post. Always skittish about using dreaded words like “probability” or “randomness,” writers on manufacturing or service operations, even Deming, prefer to use “variability” or “variation” for the way both demand and performance change over time, but these terms do not mean the same thing. For example, a hotel room that goes for $100/night in November through March and $200/night from April to October has a price that is variable but not random. The rates are published, and you know them ahead of time.

By contrast, to a passenger, the airfare from San Francisco to Chicago is not only variable but random. The airlines change tens of thousands of fares every day in ways you discover when you book a flight. Based on having flown this route four times in the past 12 months, however, you expect the fare to be in the range of $400 to $800, with $600 as the most likely. The information you have is not complete enough for you to know what the price will be but it does enable you to have a confidence interval for it.

Beyond randomness, events like the 9/11 attack in 2001, the financial crisis in 2008, the Fukushima earthquake in 2011, or a toy that is a sudden hit for the Christmas season create uncertainty, a higher level of variability than randomness. Such large-scale, unprecedented events give you no basis to say, on 9/12/2001, when airliners would fly again, in 2008 how low the stock market would go, in 2011 when factories in northeastern Japan would restart, or how many units of the popular toy you should make.

In Manufacturing, you encounter all three types of variability, each requiring different management approaches. In production planning, for example:

  1. When the volume and mix of products to manufacture are known far in advance relative to your production lead time, you have a low-volume/high-mix but deterministic demand. The demand for commercial aircraft is known 18 months ahead of delivery. If you supply this industry with a variety of components that you can buy materials for, build, and ship within 6 weeks, you still have to plan and schedule production, but your planners don’t need to worry about randomness or uncertainty.
  2. When volume and mix fluctuate around constant levels or a predictable trend, you have a random demand. The amplitude of fluctuations in aggregate volume is smaller than for individual products. In this context, you can use many tools. You can, for example, manage a mixed-flow assembly line by operating it at a fixed takt time, revised periodically, using overtime to absorb fluctuations in aggregate volumes, heijunka to sequence the products within a shift, and kanbans to regulate the flow of routinely used components to the line.
  3. As recent history shows, uncertain events occur that can double or halve your demand overnight. No business organization can have planned responses to all emergencies, but it must be prepared to respond when it needs to. The resources needed in an emergency that need to be nurtured in normal times include a multi-skilled, loyal, and motivated workforce, as well as a collaborative supply chain.
    In many cases, you have to improvise a response; in some, vigilance can help you mitigate the impact of the event. Warned by weather data, Toyota’s logistics group in Chicago anticipated the Mississippi flood of 1993. They were shipping parts by intermodal trains to the NUMMI plant in California and, two days before the flood covered the tracks, they reserved all the available trucking in the area, which cost them daily the equivalent of 6 minutes of production at NUMMI. They were then able to reroute the shipments south of the flooded area.

The distinction between random and uncertain is related to that between common and special causes introduced by Shewhart and Deming in the narrower context of quality control. In Deming’s red bead game,  operators plunge a paddle into a bowl containing both white and red beads with the goal of retrieving a set of white beads only, and most paddle loads are defective.

The problem has a common cause: the production system upstream from this operation is incapable of producing bowls without red beads. In Deming’s experiment, the managers assume it has a special cause: the operator is sloppy. They first try motivational slogans, then discipline, and eventually fire the operator. The proper response would have been (1) as an immediate countermeasure, filtering out the red beads before the operation and (2) for a permanent solution, working with the source to improve the process so that it provides batches with all white beads every time.

The imprecision — or randomness — of the process is summarized in terms of its capability, which sets limits on observable parameters of outgoing units. Observations outside of these limits indicate that, due to a special cause, to be identified, the capability model no longer matches reality. In the other cases discussed above, the cause is known: you felt the earthquake, or you heard on the news that war broke out… The only challenge you are facing is deciding how to respond.

Deming made “knowledge of variation” one of the pillars of his “system of profound knowledge.” One key part of this knowledge is recognition of the different types of variability described above and mastery of the tools available to deal with each.

#Variation, #Variability, #Randomness, #Uncertainty, #Toyota, #MississippiFloodOf1993, #Deming

How to Pick the Fastest Line at the Supermarket | New York Times [Debunk]

Inside a Whole Foods in Brooklyn (New York Times)

“[…] Choose a single line that leads to several cashiers

Not all lines are structured this way, but research has largely shown that this approach, known as a serpentine line, is the fastest. The person at the head of the line goes to the first available window in a system often seen at airports or banks. […]”

Sourced through the New York Times

Michel Baudin’s comments:

No! Research shows no such thing. The serpentine line does not reduce the customers’ mean time through the system. Little’s Law tells us that, in steady state, regardless of how the queue is organized:

{Mean\, time\, in\, system = \frac{Mean\, number\, of\, customers\, in\, system}{Mean\, throughput}}


A factory can always be improved

Based on an NWLEAN post entitled: Laws of Nature – Pareto efficiency and Pareto improvements, from 3/3/2011 

In manufacturing, Italian economist Vilfredo Pareto is mostly known for the Pareto diagrams and the 80/20 law, but  in economics, he is also known for the unrelated concept of Pareto efficiency, or Pareto optimality, which is also relevant to Lean. A basic tenet of Lean is that a factory can always be improved, and that, once you have achieved any level of performance, it is just the starting point for the next round of improvement. Perfection is something you never achieve but always pursue and, if you dig deep enough, you always find opportunities. This is the vocabulary you use when discussing the matter with fellow production people. If, however, you are taking college courses on the side, you might score more points with your instructor by saying, as an empirical law of nature, that a business system is never Pareto-efficient. It means the same thing, but our problem is that this way of thinking is taught neither in Engineering nor in Business school, and that few managers practice it.

A system is Pareto-efficient if you cannot improve any aspect of its performance without making something else worse. Managers who believe their factories to be Pareto-efficient think, for example, that you cannot improve quality without lengthening lead times and increasing costs, when improving quality without such penalties is exactly what Lean does. In fact, eliminating waste is synonymous with making improvements in some dimensions of performance without degrading anything else, or taking advantage of the lack of Pareto-efficiency in the plant.

When we say that a factory can always be improved it is a postulate, an assumption you start from when you walk through the gates. The overwhelming empirical evidence is that, if you make that assumption, you find improvement opportunities. Obviously, if you don’t make that assumption, you won’t find any, because you won’t be trying.

This is not a minor issue. Writing in the Harvard Business Review back in 1991 about Activity-Based Costing, Robert Kaplan stated that all the possible shop floor improvements had already been made over the previous 50 years. He was teaching his MBA students that factories were Pareto-efficient and that it was therefore pointless to try and improve them. They would do better to focus on financial engineering and outsource production.

The idea that improving factories is futile and a distraction from more “strategic” pursuits dies hard. It is expressed repeatedly in a variety of ways. The diminishing returns argument is that, as you keep reaching for fruits that hang ever higher, the effort required starts being excessive with respect to the benefits, but there are two things to consider:

  • As you make improvements, you enhance not only performance but your own skills as well, so that some of what was out of reach before no longer is.
  • Competition is constantly raising the bar. If your competitors keep improving and you don’t, you lose.

Another argument is that the focus on waste elimination discourages activities like R&D that do not have an immediate impact on sales. The improvement effort, however,  isn’t about what we do but how we do it. Nobody in his right mind would call R&D waste, even on projects that fail. Waste in R&D comes in the form of researchers waiting for test equipment, sitting through badly organized meetings, or filling out administrative paperwork.

In manufacturing itself, some see the pursuit of improvement as a deterrent to investment in new technology. While it is clear that the improvement mindset does not lead to solving every problem by buying new machines,  the  practitioners of continuous improvement are in fact better informed, savvier buyers of new technology. On one side of the shop floor, you see a cell with old machines on which incremental improvements over several years have reduced staffing requirements from 5 operators to 1. On the other side of the aisle, you see a brand new, fully automatic line with a design that incorporates the lessons learned on the old one.

Others have argued that a society that pursues improvement will be slower to develop and adopt new, disruptive technology. But does the machinist improving a fixture deter the founder of the next Facebook? There is no connection. If the machinist were not making improvements, his creativity would most likely be untapped. And his improvement work does not siphon off the venture capital needed for disruptive technology.

Comparative advantage in the allocation of work among machines

Another NWLEAN post in response to Mike Thelen’s query on Laws of Nature, posted on 2/11/2011

On several occasions, I ran into the problem of allocating work among machines of different generations with overlapping capabilities. There were several products that could be processed to the same levels of quality in both the new and the old machines. The machines worked differently. For example, the old machines would process parts in batches while the new ones supported one-piece flow. But the resulting time per part was shorter on the new machines for all products. In other words, the new machines had a higher capacity for everything.

Given that the products were components going into the same assemblies, they were to be made in matching quantities per the assembly bill of materials and the demand was such that the plant had to make as many matching sets as possible. The question then is: how do you allocate the work among the machines?

When I first saw this problem, I thought it was unique but, in fact, many machine shops keep multiple generations of machines on their floors and make parts in matching sets for their customers; it is quite common. The solution that maximizes the total output is to apply the law of comparative advantage from classical economics. Adapted to this context, it says that the key is the ratio of performance between the old and the new machines on each product. For example, if the new machine can do product X 30% faster than the old machine and product Y ten times faster, then the old machine is said to have a comparative advantage on product X, and you should run as much as possible of product X on the old machine.
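
As an illustration, here is a minimal sketch, with made-up numbers, that states the allocation as a small linear program using SciPy: the old machine takes 2 minutes per piece of X and 10 minutes per piece of Y, the new machine takes 1.5 and 1 minute, each machine has 480 minutes available, and X and Y are needed in matching 1:1 sets.

from scipy.optimize import linprog

# Variables: x_old, x_new, y_old, y_new, sets  (all in pieces)
c = [0, 0, 0, 0, -1]                 # maximize the number of matching sets
A_ub = [[2, 0, 10, 0, 0],            # minutes used on the old machine
        [0, 1.5, 0, 1, 0]]           # minutes used on the new machine
b_ub = [480, 480]
A_eq = [[1, 1, 0, 0, -1],            # X output equals the number of sets
        [0, 0, 1, 1, -1]]            # Y output equals the number of sets
b_eq = [0, 0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 5)
x_old, x_new, y_old, y_new, sets = res.x
print(f"{sets:.0f} sets: old machine makes {x_old:.0f} X and {y_old:.0f} Y, "
      f"new machine makes {x_new:.0f} X and {y_new:.0f} Y")

With these numbers, the optimum fills the old machine entirely with product X, on which its disadvantage is smallest, which is what the law of comparative advantage prescribes.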

It is a bit surprising at first, but easy to apply. What is more surprising is that so few plants do. The logic that is actually most commonly used is to load up the new machine with as much work as possible, on the grounds that it has a high depreciation and needs to “earn its keep.” What many managers have a difficult time coming to terms with is that what you paid for a machine and when you paid it is irrelevant when allocating work, because it is in the past and nothing you do will change it. You produce today with the machines you have, and the only thing that matters is what they can do, now and in the future.

The law of comparative advantage is taught in economics, not manufacturing or industrial engineering, and pertains to the benefits of free trade between countries, not work allocation among machines. The similarity is not obvious. This law is attributed to David Ricardo who published in 1817, based on an analysis of the production of wine and cloth in England and Portugal. Trade was free because, at the time, Portugal was under British occupation. Both wine and cloth were cheaper to produce in Portugal, but wine was much cheaper and cloth only slightly cheaper. England had therefore a comparative advantage on cloth, and the total output of wine and cloth was maximized by specializing England on cloth and Portugal on wine. You transplant that reasoning to your machine shop by mapping the countries to machines and costs to process times.

This simple approach works in a specific context. It is not general, but is of value because that context occurs in reality. The literature on operations research is full of more complicated ways to arrive at solutions in different situations. I wrote an article about this in IE Magazine in July 2006, entitled “Not-so-basic equipment: the pitfalls to avoid when allocating work among machines.” It used to be available on line for free on the magazine’s web site. Now you have to buy it on Amazon to download it.

Learning or experience curves

The following is a revision of a posting on NWLEAN in January, 2011 in response to Mike Thelen’s call for “Laws of nature” in manufacturing.

Learning curves are often mentioned informally, as in “there is a learning curve on this tool,” just to say that it takes learning and practice to get proficient at it. There is, however, a formal version expressing costs as a function of cumulative production volume during the life of a manufactured product. T. P. Wright first introduced the learning curve concept in the US aircraft industry in 1936, about labor costs; Bruce Henderson generalized it into the experience curve, to include all costs, particularly those of purchased components.

The key idea is to look at cumulative volume. After all, how many units of a product you have made since you started is your experience, and it stands to reason that, the more you have already made of a product, the easier and cheaper it becomes for you to build one more. The x-axis of the experience curve is defined clearly and easily. The y-axis, on the other hand, is the cost per unit of the product, one of the characteristics that are commonly discussed as if they were well-defined, intrinsic properties like weight and color. They really are a function of current production volume, and contain allocations that can be calculated in different ways for shared resources and resources used over time. The classic reference on the subject, Bruce Henderson’s Perspectives on Experience (1972), glosses over these difficulties and presents empirical evidence about prices rather than costs.

Assuming an unambiguous and meaningful definition of unit costs, it is reasonable to assume that they would decline as experience in making the product accumulates. But what might be the shape of the cost decline curve?  Engineers like to plot quantities and look for straight lines on various kinds of graph paper. Even before looking at empirical data, we can reflect on the logic of the most common types of models:

  1. In a plot of unit cost versus cumulative volume in regular, Cartesian coordinates, a straight line means a linear cost decline, which makes no sense because you would end up with negative costs for a sufficiently large volume.
  2. In a semi-logarithmic plot, a straight line would mean an exponential cost decline, which makes no sense either, because you could make an infinite volume at a finite cost.
  3. If you try a log-log plot, a straight line means an inverse-power cost decline, meaning, for example, that the unit cost drops by 20% every time the cumulative volume doubles, as in the formula below. This approach has none of the above problems. It represents a smooth decline as long as production continues, slow enough that the cumulative cost keeps growing to infinity with the volume.
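
In formula form (my notation), a 20% drop with every doubling means that the unit cost after a cumulative volume of n units is:

c(n) = c(1)\times n^{\log_{2}0.8}\approx c(1)\times n^{-0.32}

so that c(2n) = 0.8×c(n) at every point on the curve, and the plot of log c(n) against log n is a straight line.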

I don’t know of any deeper theoretical justification for using inverse-power laws in learning or experience curves. Henderson investigated the prices of various industrial products. I remember in particular his analysis of the Ford Model T, which showed prices from 1908 to 1927 that were consistent with a fixed percentage drop in unit costs for each doubling of the cumulative volume. The prices followed an obvious straight line on a log-log plot, suggesting that the underlying costs did the same.

Today, you don’t hear much about experience curves in the car industry, but you do in Electronics, where products have much shorter lives and this curve is a key factor in planning. When working in semiconductors, I remember a proposal from a Japanese electronics manufacturer that was designing one of our chips into a product. Out of curiosity, I plotted the declining prices they were offering to pay  for increasing quantities on log-log scales, and found that they were perfectly aligned. There was no doubt that this was how they had come up with the numbers.

The slope of your own curve is a function of your improvement abilities. Your market share then determines where you are on the x-axis. The higher your market share, the faster your cumulative production volume grows. Being first lets you grab market share early; being farther along the curve than your competitors allows you to retain it.

Lead times, work sampling, and Little’s Law

On 1/11/2011, Michael Thelen asked in the NWLEAN forum about “laws of nature” as they related to Lean. This is based on one of my answers

Lead time is a key performance indicator of manufacturing operations, but how do you measure it? It is not a quantity that you can directly observe by walking through a plant. To measure it directly you need to retrieve start and finish timestamps from historical data, assuming they are available and accurate. Or you could put tracers on a sample of parts, which means that it would take you at least six weeks to measure a six-week lead time. In most plants, however, a quick and rough estimate is more useful than a precise one that takes extensive time and effort to achieve.

That is where work sampling and Little’s Law come in handy. The key idea of work sampling, which the Wikipedia article fails to make clear, is that it lets you infer a breakdown of each entity’s status over time from snapshots of the status of multiple identical entities. If, every time you go to the shop floor, you see 2 out of 10 operators walking the aisle, you infer that, on the average, each operator spends 20% of the time walking the aisle.

There are obviously necessary conditions for such an inference to be valid. For example, you want to take snapshots during normal operations, not during startup or shutdown, and the group you are measuring must be homogeneous: if it is comprised of two materials handlers and eight production operators, the 20% average is not interesting, even if it is accurate. Work sampling is usually described as applied to people, but the same logic is applicable to machines and to work pieces, and that is what makes it possible to infer lead times from snapshots of inventory and throughput rates.

On the shop floor, you can count parts, bins or pallets, and observe the pace at which they are being consumed. Let us assume we are in the context shown in Figure 1, and want to estimate how long we take to turn a blank into a finished good.


Figure 1. Context of Little’s Law

Little’s Law, then, says that on the average,  in steady state, within one process or process segment,

Inventory = Lead time X Throughput
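
For example, with made-up numbers: if snapshots of a process segment consistently show about 1,200 pieces of WIP between the incoming blanks and the outgoing finished goods, and the segment ships at about 100 pieces/hour, then:

Lead time = 1,200 pieces ÷ 100 pieces/hour = 12 hours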

The reason this is true is best explained graphically, as in Figure 2, in the simple case of constant throughput and lead time. The cumulative count of blanks in is a straight line going up over time, and so is the count of finished goods out, offset by the lead time.  The vertical distance between the curves is the number of blanks that have come in but not yet made it out as products, and represents therefore the inventory. The slope of each curve is the throughput, and it is clearly the ratio of the inventory to the lead time.

Figure 2. Little’s Law with constant throughput and lead time

What is interesting about Little’s Law is that it remains valid for the averages when the rates of arrival of blanks and of departure of finished goods are allowed to fluctuate randomly around their means. This is probably the best known and most useful general result of queueing theory.

Since we can count inventory and measure throughput, we can infer average lead times from just this data. One snapshot will not give you an accurate estimate, but it is still considerably easier to take a few snapshots of a production line to get a more accurate estimate than it is to research history. The point is to get close to an answer that would take much longer to get if you actually had to be accurate.