Data science Archives – Page 5 of 7

Mar 2 2019

The Math Behind The Process Behavior Chart

Ever since asking Is SPC Obsolete? on this blog almost 6 years ago, multiple sources have told me that the XmR chart is a wonderful and currently useful process behavior chart, universally applicable, a data analysis panacea, requiring no assumption on the structure of the monitored variables. So I dug into it and this is what I found.
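For readers who have not seen one, here is a minimal sketch of how XmR limits are conventionally computed. It is not taken from the post itself: the function names and the sample data are hypothetical, and the scaling constants 2.66 and 3.268 are the ones usually quoted for moving ranges of two consecutive points.

```python
from statistics import mean

def xmr_limits(values):
    """Return the center line and limits of an XmR chart (conventional formulas)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving ranges: absolute differences between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        # 2.66 is the customary constant 3 / 1.128 for two-point moving ranges.
        "lower": x_bar - 2.66 * mr_bar,
        "upper": x_bar + 2.66 * mr_bar,
        # Upper limit for the moving-range chart (often quoted as 3.267 or 3.268).
        "mr_upper": 3.268 * mr_bar,
    }

def out_of_limits(values, limits):
    """Return the indices of observations outside the X chart limits."""
    return [i for i, x in enumerate(values)
            if x < limits["lower"] or x > limits["upper"]]

# Hypothetical measurements, for illustration only.
baseline = [10, 12, 11, 13, 12]
limits = xmr_limits(baseline)
flagged = out_of_limits(baseline + [30], limits)
```

In this sketch the limits are computed from the baseline only, and the later observation is then checked against them. Whether these limits mean what the chart's advocates claim is precisely the question the post examines.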

By Michel Baudin • Data science 45 • Tags: Process Behavior Chart, SPC, XmR Chart

Jun 2 2018

Data Mining/Machine-Learning Tools In Manufacturing

This elaborates on the section on Analyzing The Data of the previous post. For a list of tools used for “data mining” or “machine-learning,” I researched, for each one, who invented it, when it was invented, for what purpose, and what applications it has had in manufacturing, and summarized my findings in the table below.

I am, however, not satisfied with the level of applications I found and would like to crowdsource more. If you have made these or other tools useful in your own manufacturing environment, please share whatever information you can about your applications in the survey that follows.

By Michel Baudin • Data science 1 • Tags: Data mining, data science, Machine Learning, Manufacturing

May 23 2018

Using Data Science To Improve Manufacturing

If you google “data-science + manufacturing,” what comes back is recycled hype about the factory of the future. The same vision has been painted before and hasn’t come to pass. Yet we are expected to believe that this time it will be a “4th industrial revolution.” Whether it’s true or not, this happy talk is no help in today’s factories. “Data science” covers real advances in the art of working with data, and the more relevant question is what it can do to improve existing operations.

This is not just about reaping tangible benefits today rather than hypothetical ones in the future but also about acquiring skills needed to design new plants and production lines 5 years from now. These publications endow technology with a power to drive innovation that it doesn’t have. It is only a means for people to innovate. Their ability to do so hinges on their mastery of the technology, which is acquired by using it in continuous improvement.

By Michel Baudin • Data science 7 • Tags: analytics, Data munging, data science, Data wrangling, Machine Learning, Visualization

Dec 10 2017

Is SPC Obsolete? (Revisited)

Six years ago, one of the first posts in this blog — Is SPC Obsolete? — started a spirited discussion with 122 comments. Reflecting on it, however, I find that the participants, including myself, missed the mark in many ways:

My own post and comments were too long on what is wrong with SPC, as taught to this day, and too short on alternatives. Here, I am attempting to remedy this by presenting two techniques, induction trees and naive Bayes, that I think should be taught as part of anything reasonably called statistical process control. I conclude with what I think are the cultural reasons why they are ignored.
The discussions were too narrowly focused on control charts. While the Wikipedia article on SPC is only about control charts, other authors, like Douglas Montgomery or Jack B. Revelle, see it as including other tools, such as scatterplots, Pareto charts, and histograms, topics that none of the discussion participants said anything about. Even among control charts, there was undue emphasis on just one kind, the XmR chart, that Don Wheeler thinks is all you need to understand variation.
Many of the contributors resorted to the argument of authority, saying that an approach must be right because of who said so, as opposed to what it says. With all due respect to Shewhart, Deming, and Juran, we are not going to solve today’s quality problems by parsing their words. If they were still around, perhaps they would chime in and exhort quality professionals to apply their own judgment instead.
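As a rough illustration of the naive Bayes technique named in the first point above, here is a minimal sketch for categorical data with add-one smoothing. It is not the post's own example: the function names, the feature columns (machine, shift), and the tiny data set are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log

def train(rows, labels):
    """Count classes and per-class feature values for categorical naive Bayes."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (label, column) -> value counts
    values_seen = defaultdict(set)         # column -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(label, j)][value] += 1
            values_seen[j].add(value)
    return class_counts, feature_counts, values_seen

def predict(model, row):
    """Return the label with the highest log posterior score."""
    class_counts, feature_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = log(n / total)  # log prior
        for j, value in enumerate(row):
            count = feature_counts[(label, j)][value]
            # Add-one smoothing keeps unseen combinations from zeroing the score.
            score += log((count + 1) / (n + len(values_seen[j])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical records: (machine, shift) -> inspection outcome.
rows = [("A", "day"), ("A", "night"), ("B", "night"),
        ("B", "night"), ("A", "day"), ("B", "day")]
labels = ["ok", "ok", "defect", "defect", "ok", "ok"]
model = train(rows, labels)
```

With this made-up data, `predict(model, ("B", "night"))` favors "defect" and `predict(model, ("A", "day"))` favors "ok". The "naive" part is the assumption that the features are independent within each class, which is why the per-feature terms are simply added in log space.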

By Michel Baudin • Data science 3 • Tags: data science, Quality, SPC, Statistical Process Control

Oct 23 2017

There Is More To Data Than Just Numbers

Don Wheeler’s Understanding Variation starts with a chapter entitled “Data are random and miscellaneous” that contains no discussion of any part of its title. Implicit in Wheeler’s book, however, is the view that data consists of tables of numbers, representing either measured variables — lengths, weights, densities,… — or event occurrence counts — defective units, defects, machine failures,…

Many times, I have quoted computer scientist Don Knuth on this subject, saying that data is “the stuff that’s input or output,” meaning anything that can be read or written, and it includes much more than tables of numbers. The data we work with today includes, for example, the following:

Unstructured text, like 25,000 incident reports written by maintenance techs all over the world in their versions of English about problems with jet engines, or thousands of product reviews posted by consumers on e-commerce sites
Images, like photographs of visual defects on products, or electron-microscope images of integrated circuits.
Video recordings of operations.
…

Analyzing data about a manufacturing process today means extracting information from all sources. The state of the art, based on automatic data acquisition and databases, includes analytical techniques that were unthinkable in Shewhart’s day, known under the labels of data science, data mining, or machine learning.

By Michel Baudin • Data science 2 • Tags: data science, Six Sigma, SPC, Text mining

Jan 3 2017

Probability For Professionals

In a previous post, I pointed out that manufacturing professionals’ eyes glaze over when they hear the word “probability.” Even outside manufacturing, most professionals’ idea of probability is that, if you throw a die, you have one chance in six of getting an ace.

2000 years ago, Claudius wrote a book on how to win at dice but the field of inquiry has broadened since, producing results that affect business, technology, science, politics, and everyday life.

In the age of big data, all professionals would benefit from digging deeper and becoming, at least, savvy recipients of probabilistic arguments prepared by others. The analysts themselves need a deeper understanding than their audience.

With the software available today in the broad categories of data science or machine learning, however, they don’t need to master 1,000 pages of math in order to apply probability theory, any more than you need to understand the mechanics of gearboxes to drive a car.

It wasn’t the case in earlier decades, when you needed to learn the math and implement it in your own code. Not only is it now unnecessary, but many new tools have been added to the kit. You still need to learn what the math doesn’t tell you: which tools to apply, when and how, in order to solve your actual problems. It’s no longer about computing, but about figuring out what to compute and acting on the results.

Following are a few examples that illustrate these ideas, and pointers on concepts I have personally found most enlightening on this subject. There is more to come, if there is popular demand.

By Michel Baudin • Data science 1 • Tags: data science, Manufacturing, Probability, Randomness, Variability
