Michel Baudin's Blog
Ideas from manufacturing operations

Oct 12 2022

Musings on Large Numbers

Anyone who has taken an introductory course in probability, or even SPC, has heard of the law of large numbers. It’s a powerful result from probability theory, and, perhaps, the most widely used. Wikipedia starts its article on this topic with a statement that is free of caveats or restrictions:

In probability theory, the law of large numbers is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and tends to become closer to the expected value as more trials are performed.

This is how the literature describes it and how most professionals understand it. Buried in the fine print of the Wikipedia article, however, you find conditions for this law to apply. First, we discuss the difference between sample averages and expected values, both of which we often call “mean.” Then we consider applications of the law of large numbers in cases ranging from SPC to statistical physics. Finally, we zoom in on a simple case, the Cauchy distribution: it emerges easily from experimental data, yet the law of large numbers does not apply to it.
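The contrast is easy to see in a simulation. The sketch below (assuming NumPy; the sample sizes and seed are arbitrary choices for illustration) tracks the running mean of Gaussian samples, which settles near the expected value, against the running mean of Cauchy samples, which never settles because the Cauchy distribution has no expected value:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Gaussian samples: the running mean converges toward the expected value (0)
gauss = rng.normal(loc=0.0, scale=1.0, size=n)
gauss_running_mean = np.cumsum(gauss) / np.arange(1, n + 1)

# Cauchy samples: the distribution has no expected value, so the
# running mean keeps jumping no matter how many trials you add
cauchy = rng.standard_cauchy(size=n)
cauchy_running_mean = np.cumsum(cauchy) / np.arange(1, n + 1)

print("Gaussian running mean at 1K, 10K, 100K trials:",
      gauss_running_mean[[999, 9_999, 99_999]])
print("Cauchy running mean at 1K, 10K, 100K trials:",
      cauchy_running_mean[[999, 9_999, 99_999]])
```

Rerunning with different seeds makes the point even more vividly: the Gaussian checkpoints always shrink toward 0, while the Cauchy checkpoints land somewhere different every time.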

Continue reading…


By Michel Baudin • Laws of nature • 1


Oct 4 2022

Strange Statements About Probability Models | Don Wheeler | Quality Digest

In his latest column in Quality Digest, Don Wheeler wrote the following blanket statements, free of any caveat:

  1. “Probability models are built on the assumption that the data can be thought of as observations from a set of random variables that are independent and identically distributed.”
  2. “In the end, which probability model you may fit to your data hardly matters. It is an exercise that serves no practical purpose.”

Source: Wheeler, D. (2022) Converting Capabilities, What difference does the probability model make? Quality Digest

Michel Baudin’s comments:

Not all models assume i.i.d. variables

Wheeler’s first statement might have applied 100 years ago. Today, however, there are many models in probability that are not based on the assumption that data are “observations from a set of random variables that are independent and identically distributed”:

  • ARIMA models for time series are used, for example, in forecasting beer sales.
  • Epidemiologists use models that assume existing infections cause new ones; therefore, counts for successive periods are not independent.
  • The spatial data analysis tools used in mining and oil exploration assume that an observation at any point informs you about its neighborhood. The analysts don’t assume that observations at different points are independent.
  • The probability models used to locate a wreck on the ocean floor, find a needle in a haystack, and other similar search problems have nothing to do with series of independent and identically distributed observations.
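The simplest member of the ARIMA family already breaks the i.i.d. assumption. The sketch below (assuming NumPy; the coefficient and sample size are arbitrary) simulates an AR(1) series, where each value carries over most of the previous one, and checks that successive observations are strongly correlated — nowhere near the zero correlation that independence implies:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
phi = 0.8  # each observation carries 80% of the previous one

# AR(1): x[t] = phi * x[t-1] + noise -- the simplest non-i.i.d. model,
# and a building block of the ARIMA family
x = np.zeros(n)
noise = rng.normal(size=n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + noise[t]

# Lag-1 autocorrelation: close to phi, not to the 0 that i.i.d. would give
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(f"lag-1 autocorrelation: {lag1:.2f}")
```

Treating such a series as i.i.d. — for example, by putting the raw values on a control chart — would produce limits that are far too tight and flag fluctuations as signals.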

Probability Models Are Useful

In his second statement, Wheeler seems determined to deter engineers and managers from studying probability. If a prominent statistician tells them it serves no useful purpose, why bother? It is particularly odd when you consider that Wheeler’s beloved XmR/Process Behavior charts use control limits based on a model of observations as the sum of a constant and Gaussian white noise.

Probability models have many useful purposes. They can keep you from pursuing special causes for mere fluctuations and help you find the root causes of actual problems. They also help you plan your supply chain and dimension your production lines.
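The XmR chart itself illustrates the point: its standard limit formula, center ± 2.66 × average moving range, is only meaningful because the 2.66 factor converts the average moving range into a 3-sigma estimate under the constant-plus-Gaussian-noise model. A minimal sketch (assuming NumPy; the sample data are made up):

```python
import numpy as np

def xmr_limits(data):
    """Natural process limits for an XmR (individuals) chart.

    The 2.66 factor is 3/d2, with d2 = 1.128 for moving ranges of
    size 2: it turns the average moving range into an estimate of
    3 sigma under the constant-plus-Gaussian-noise model.
    """
    data = np.asarray(data, dtype=float)
    center = data.mean()
    avg_moving_range = np.abs(np.diff(data)).mean()
    return (center - 2.66 * avg_moving_range,
            center,
            center + 2.66 * avg_moving_range)

lcl, center, ucl = xmr_limits([10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7])
print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
```

Without the underlying probability model, 2.66 is just a magic number and there is no basis for reading a point outside the limits as a signal.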

Histograms are Old-Hat; Use KDE Instead

As Wheeler also says, “Many people have been taught that the first step in the statistical inquisition of their data is to fit some probability model to the histogram.” It’s time to learn something new that takes advantage of IT developments since Karl Pearson invented the histogram in 1891.

Fitting models to a sample of 250 points based on a histogram is old hat. A small dataset today is more like 30,000 points, and you visualize its distribution with kernel density estimation (KDE), not histograms.
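A short sketch of the idea (assuming NumPy and SciPy; the bimodal sample is synthetic, chosen because coarse histogram bins can smear out exactly this kind of structure):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
# A 30,000-point sample from a two-peaked mixture
data = np.concatenate([rng.normal(-2, 0.5, 15_000),
                       rng.normal(2, 0.5, 15_000)])

kde = gaussian_kde(data)          # bandwidth set by Scott's rule by default
grid = np.linspace(-5, 5, 201)
density = kde(grid)               # smooth density estimate, ready to plot

# Both modes stand out: the density near -2 and +2 dwarfs the valley at 0
print("density at -2, 0, +2:", kde([-2.0, 0.0, 2.0]))
```

Unlike a histogram, the KDE curve does not depend on an arbitrary choice of bin edges, and with samples this large the smoothing introduces negligible distortion.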

#donwheeler, #probability, #quality


By Michel Baudin • Press clippings • 8 • Tags: Don Wheeler, Probability, Quality


Jul 18 2022

The Most Basic Problem in Quality

Two groups of parts are supposed to be identical in quality: they have the same item number and are made to the same specs, at different times in the same production lines, at the same time in different lines, or by different suppliers.

One group may be larger than the other, and both may contain defectives. Is the difference in fraction defective between the two groups a fluctuation, or does it have a cause you need to investigate? It’s as basic a question as it gets, but it’s a real problem, with solutions that aren’t quite as obvious as one might expect. We review several methods that have evolved over the years with information technology.
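One standard answer, among the methods in this post's tags, is Fisher's exact test. A sketch with made-up counts (assuming SciPy):

```python
from scipy.stats import fisher_exact

# Group A: 12 defectives out of 1,000 parts (1.2%)
# Group B: 25 defectives out of 1,500 parts (~1.7%)
# Fluctuation, or a difference worth investigating?
table = [[12, 1000 - 12],
         [25, 1500 - 25]]

odds_ratio, p_value = fisher_exact(table)
print(f"p-value: {p_value:.3f}")
```

A large p-value means counts like these arise routinely from two groups with the same underlying fraction defective, so there is no case yet for an investigation; a small one says the difference is unlikely to be a fluctuation.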

Continue reading…


By Michel Baudin • Data science • 0 • Tags: A/B testing, Barnard's Test, Binomial Probability Paper, Fisher's Test, Incoming QA, Z-test


Jun 30 2022

A Kaizen Case Study

This is the start of a new section of this blog, about case studies. The stories do not have to be extraordinary, but they have to be real, from factories large and small. The Japanese example below is a manga. It’s a difficult art, and I am not expecting anyone to submit cases in this form. An infographic showing before and after states, methods used, and results achieved would be plenty. I will then format it for this blog and post it in this category.

Continue reading…


By Michel Baudin • Case studies • 0


Jun 12 2022

Perspectives On Probability In Operations

The spirited discussions on LinkedIn about whether probabilities are relative frequencies or quantifications of beliefs are guaranteed to baffle practitioners. They come up in threads about manufacturing quality, supply-chain management, and public health, and do not generate much light. Their participants trade barbs without much civility and without actually engaging on substance.

The latest one, by Alexander von Felbert, is among the more thoughtful, and therefore unlikely to inspire rants. I do, however, fault it for using words like “aleatory” or “epistemic” that I don’t think are helpful. I am trying to discuss it here in everyday language and to apply the concepts to numerically specific cases, with an eye to operations.

While there are genuinely great and not-so-great ideas, the root of the most violent disagreements is elsewhere, with individuals generalizing from different experience bases. You may map probability to reality differently depending on whether you are developing drugs in the pharmaceutical industry, enhancing yield in a semiconductor process, or driving down dppms in auto parts. The math doesn’t care as long as you follow its rules, and it doesn’t invalidate other interpretations.

Continue reading…


By Michel Baudin • Data science • 0 • Tags: Bayesian Statistics, data science, Probability, statistics


Apr 27 2022

Flow

In his latest post on AllAboutLean, Christoph Roser compares the flow of materials in a factory with the flow of traffic on roads. About flow, he asks “But what is it?” but stops short of giving an answer. I also wrote many posts about flow without ever bothering to answer that question. It seemed so obvious and self-explanatory that it didn’t require defining but, perhaps, it does.

Continue reading…


By Michel Baudin • Uncategorized • 4 • Tags: Data Flow, Flow, Flow line, Information Flow, Job shop, Material Flow





© Michel Baudin's Blog 2025