Sep 23 2023
Orbit Charts, Revisited
Data visualization is not just the art of presenting data to an audience. Upstream from this, you use visualizations in data cleaning to identify defective points, and in exploratory analysis, to identify patterns of interest. Then, you validate these patterns with a more formal analysis. Once confident that you have findings of value to communicate, you worry about making a compelling presentation.
Nick Desbarats and I had a long exchange on LinkedIn prompted by his article “Connected Scatterplots Make Me Feel Dumb,” published on 8/29/2023 in Nightingale, the journal of the Data Visualization Society. What he calls connected scatterplots are what I call orbit charts, and I have found them helpful, particularly in analysis.
Jun 25 2025
Update on Data Science versus Statistics
Based on the usage of the terms in the literature, I have concluded that statistics has been subsumed under data science. I view statistics as beginning with a dataset and ending with conclusions, while data science starts with sensors and transaction processing, and ends in data products for end users. Kelleher & Tierney’s Data Science views it the same way, and so do tool-specific references like Wickham & Grolemund’s R for Data Science, or Zumel & Mount’s Practical Data Science with R.
Brad Efron and Trevor Hastie are two prominent statisticians with a different perspective. In the epilogue of their 2016 book, Computer Age Statistical Inference, they describe data science as a subset of statistics that emphasizes algorithms and empirical validation, while inferential statistics focuses on mathematical models and probability theory.
Efron and Hastie’s book is definitely about statistics, as it contains no discussion of data acquisition, cleaning, storage and retrieval, or visualization. I asked Brad Efron about it and he responded: “That definition of data science is fine for its general use in business and industry.” He and Hastie were looking at it from the perspective of researchers in the field.
By Michel Baudin • Data science, Uncategorized • 0 • Tags: data science, math, statistics