Jun 12 2022
Perspectives On Probability In Operations
The spirited discussions on LinkedIn about whether probabilities are relative frequencies or quantifications of beliefs are guaranteed to baffle practitioners. They come up in threads about manufacturing quality, supply-chain management, and public health, and do not shed much light. The participants trade barbs with little civility and without engaging on substance.
The latest one, by Alexander von Felbert, is among the more thoughtful, and therefore unlikely to inspire rants. I do, however, fault it for using words like “aleatory” or “epistemic” that I don’t think are helpful. I am trying to discuss it here in everyday language, and to apply the concepts to numerically specific cases, with an eye to operations.
While there are genuinely great and not-so-great ideas, the root of the most violent disagreements lies elsewhere: individuals generalize from different bases of experience. You may map probability to reality differently depending on whether you are developing drugs in the pharmaceutical industry, enhancing yield in a semiconductor process, or driving down defective parts per million (dppm) in auto parts. The math doesn’t care as long as you follow its rules, and one interpretation doesn’t invalidate the others.
Jun 25 2025
Update on Data Science versus Statistics
Based on the usage of the terms in the literature, I have concluded that statistics has been subsumed under data science. I view statistics as beginning with a dataset and ending with conclusions, while data science starts with sensors and transaction processing, and ends in data products for end users. Kelleher & Tierney’s Data Science views it the same way, and so do tool-specific references like Wickham & Grolemund’s R for Data Science, or Zumel & Mount’s Practical Data Science with R.
Brad Efron and Trevor Hastie are two prominent statisticians with a different perspective. In the epilogue of their 2016 book, Computer Age Statistical Inference, they describe data science as a subset of statistics that emphasizes algorithms and empirical validation, while inferential statistics focuses on mathematical models and probability theory.
Efron and Hastie’s book is definitely about statistics, as it contains no discussion of data acquisition, cleaning, storage and retrieval, or visualization. I asked Brad Efron about it and he responded: “That definition of data science is fine for its general use in business and industry.” He and Hastie were looking at it from the perspective of researchers in the field.
By Michel Baudin • Data science, Uncategorized • 0 • Tags: data science, math, statistics