Sep 3 2024
Using Regression to Improve Quality | Part I – What for?
In quality, regression serves to identify substitutes for true characteristics that are hard to observe and to find the root causes of technically challenging process problems. It is a major topic in data science, but oddly, the most extensive coverage I could find in the literature on quality is in Shewhart’s first book, from 1931! Later books, including Shewhart’s second, discuss it briefly or not at all. In 1985, the ASQC, forerunner of the ASQ, published an 80-page guide on how to use regression analysis in quality control, but has not updated it since.
Regression analysis has been around for almost 140 years and has grown massively in scope, in capabilities, and in the size of the datasets it handles. Perhaps it is time for professionals involved with quality to take another look at it.
Sep 8 2024
Using Regression to Improve Quality | Part II – Fitting Models
This is a personal guided tour of regression techniques intended for manufacturing professionals involved with quality. Starting from “historical monuments” like simple linear regression and multiple regression, it goes through “mid-century modern” developments like logistic regression. It ends with newer constructions like bootstrapping, bagging, and MARS. It is limited in scope and depth because full coverage would require a book and knowledge of many techniques I have not tried. See the references for more comprehensive coverage.
To fit a regression model to a dataset today, you don’t need to understand the logic, know any formula, or code any algorithm. Any statistical software, starting with electronic spreadsheets, will give you regression coefficients, confidence intervals for them, and, often, tools to assess the model’s fit.
However, treating regression as a black box that magically fits curves to data is risky. You won’t understand what you are looking at and will draw mistaken conclusions. You need some idea of the logic behind regression in general, or behind specific variants, to know when to use them, how to prepare data, and how to interpret the outputs.
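To make the point concrete, here is a minimal sketch of what sits inside the black box for the simplest case, a straight line fitted by least squares. It is an illustration under stated assumptions, not the method of any particular software package: the data are made up, the function name `fit_line` is hypothetical, and the t critical value of 2.306 (97.5th percentile, 8 degrees of freedom) is hard-coded to match the 10 sample points rather than computed.

```python
import math

# Hypothetical data: 10 made-up (x, y) pairs, for example a process
# setting and a measured characteristic. Not taken from the article.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]


def fit_line(x, y, t_crit):
    """Ordinary least-squares fit of y = intercept + slope * x.

    t_crit is the two-sided t critical value for n - 2 degrees of
    freedom; the caller supplies it (an assumption of this sketch).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Regression coefficients
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residuals and goodness of fit
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst

    # Standard errors and confidence intervals for the coefficients
    s2 = sse / (n - 2)  # residual variance estimate
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))

    return {
        "slope": slope,
        "intercept": intercept,
        "slope_ci": (slope - t_crit * se_slope, slope + t_crit * se_slope),
        "intercept_ci": (intercept - t_crit * se_intercept,
                         intercept + t_crit * se_intercept),
        "r_squared": r_squared,
        "residuals": residuals,
    }


# 2.306 is the 97.5th percentile of Student's t with 8 degrees of freedom,
# giving 95% confidence intervals for these 10 points.
result = fit_line(x, y, t_crit=2.306)
print(result["slope"], result["slope_ci"])
print(result["intercept"], result["intercept_ci"])
print(result["r_squared"])
```

The confidence intervals rest on the usual assumptions of independent, normally distributed residuals with constant variance. Checking the residuals against those assumptions is exactly the kind of step a black-box user tends to skip.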
By Michel Baudin • Data science • Tags: Bagging, Bootstrapping, Kriging, Linear regression, Logistic regression, MARS, Multiple regression, Multivariate regression, Substitute characteristic, True characteristic