Nov 7 2022
Analyzing Variation with Histograms, KDE, and the Bootstrap
Assume you have a dataset that is a clean sample of a measured variable. It could be a critical dimension of a product, delivery lead times from a supplier, or environmental characteristics like temperature and humidity. How do you make it talk about the variable’s distribution? This post explores this challenge in the simple case of 1-dimensional data. I have used methods ranging from histograms to Kernel Density Estimation (KDE) and the Bootstrap, varying in vintage from the 1890s to the 1980s.
Other methods that I don’t know about or haven’t used were surely invented for the same purpose between 1895 and 1960, or since 1979. Readers are welcome to point them out.
The ones discussed here are not black boxes that automatically produce answers from a stream of data. All require a human to tune the settings of the tools, and this human needs to know the backstory of the data.
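To make the three methods concrete, here is a minimal sketch using only the Python standard library. It is not the code behind the post: the function names, the synthetic sample, the bin count, and the bandwidth are all assumptions chosen for illustration. The bin count and bandwidth are exactly the kind of settings a human has to tune.

```python
import math
import random
import statistics


def histogram(data, bins):
    """Count observations in `bins` equal-width bins spanning the data range."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum sits on the right edge; clamp it into the last bin.
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    return counts


def gaussian_kde(data, bandwidth):
    """Return a density function: the average of Gaussian bumps on each point."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

    return density


def bootstrap_ci(data, stat, n_resamples=2000, level=0.95, rng=None):
    """Percentile bootstrap interval for `stat`, resampling with replacement."""
    rng = rng or random.Random()
    n = len(data)
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Synthetic stand-in for a measured variable (assumed, not real data).
rng = random.Random(42)
sample = [rng.gauss(10.0, 0.5) for _ in range(200)]

counts = histogram(sample, bins=12)            # bin count is a tuning choice
density = gaussian_kde(sample, bandwidth=0.2)  # so is the bandwidth
ci_low, ci_high = bootstrap_ci(sample, statistics.median, rng=rng)

print("histogram counts:", counts)
print("KDE density at 10.0:", round(density(10.0), 3))
print("95% bootstrap interval for the median:", round(ci_low, 3), round(ci_high, 3))
```

Changing `bins` or `bandwidth` changes the picture of the distribution without changing the data, which is why the settings cannot be left to the tool.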
May 13 2023
Effect of COVID-19 Vaccines on Excess Deaths
Even on LinkedIn, you still see posts and comments asserting that the COVID-19 vaccines aren’t “real” and alleging that they do more harm than good. These claims are usually based on articles of questionable value and on the author’s brother-in-law catching COVID-19 while vaccinated. Public health, however, warrants serious research and is not a matter of anecdotes.
The real question is whether the administration of these vaccines to large populations was effective in curbing the pandemic and saving lives. Today, we can answer yes to some of these questions with simple methods applied to data from the US government. Specifically, we can analyze US CDC and census data on state-by-state Excess Deaths and fractions of the population vaccinated for 2021 and 2022. The results are obvious but, sometimes, the obvious needs belaboring.
By Michel Baudin • Data science • Tags: COVID-19, Vaccine