The bell curve: “Normal” or “Gaussian”?

Most discussions of statistical quality refer to the “Normal distribution,” but “Normal” is a loaded word. If we talk about the “Normal distribution,” it implies that all other distributions are, in some way, abnormal. The “Normal distribution” is also called “Gaussian,” after the discoverer of many of its properties, and I prefer it as a more neutral term. Before Germany adopted the Euro, its last 10-Mark note featured the bell curve next to Gauss’s face.

The Gaussian distribution is widely used, and abused, because its math is simple, well known, and wonderful. Here are a few of its remarkable properties:

  1. It applies to a broad class of measurement errors. John Herschel arrived at the Gaussian distribution for measurement errors in the position of bodies in the sky simply from the fact that the errors in x and y should be independent and that the probability of a given error should depend only on the distance from the true point.
  2. It is stable. If you add independent Gaussian variables, or take any linear combination of them, the result is also Gaussian.
  3. Many sums of variables converge to it. The Central Limit Theorem (CLT) says that, if you add variables that are independent and identically distributed, with a distribution that has a mean and a standard deviation, their sum, once centered and scaled, converges towards a Gaussian. This makes it an attractive model, for example, for order quantities for a product coming independently from a large number of customers.
  4. It solves the equation of diffusion. The concentration of, say, a dye introduced into clear water through a pinpoint is a Gaussian that spreads over time. You can experience it in your kitchen: fill a white plate with about 1/8 in of water, and drop the smallest amount of mint syrup you can in the center. After a few seconds, the syrup in the water forms a cloud that looks very much like a two-dimensional Gaussian bell shape for concentration, as shown in the figure. And in fact it is, because the Gaussian density function solves the diffusion equation, with a standard deviation that rises with time. It also happens in gases, but too quickly to observe in your kitchen, and in solids, but too slowly.

    [Figure: Mint syrup diffusion in water]

  5. It solves the equation of heat transfer by conduction. Likewise, when heat spreads by conduction from a point source in a solid, the temperature profile is Gaussian… The equation is the same as for diffusion.
  6. Unique filter. A time series of raw data (temperatures, order quantities, stock prices, …) usually has fluctuations that you want to smooth out in order to bring to light the trends or cycles you are looking for. A common way of doing this is replacing each point with a moving average of its neighbors, taken over windows of varying lengths, often with weights that decrease with distance, so that a point that is 30 minutes in the past counts for less than the point of 1 second ago. And you would like to set these weights so that, whenever you enlarge the window, the peaks in your signal are eroded and the valleys fill up. A surprising and relatively recent discovery (1986) is that the only weighting function that does this is the Gaussian bell curve, with its standard deviation as the scale parameter.
  7. Own transform. This is mostly of interest to mathematicians, but the Gaussian bell curve is its own Fourier transform, up to a scale factor, which drastically simplifies calculations.
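
Property 3 is easy to check numerically. The sketch below is a minimal illustration, not a proof: it sums independent Uniform(0, 1) draws, standardizes the sums, and compares the fraction that falls within 1 and 2 standard deviations with the Gaussian values. The uniform distribution, the sample sizes, the seed, and the function names `standardized_sums` and `fraction_within` are all assumptions chosen for illustration.

```python
import random
import statistics


def standardized_sums(n_terms, n_samples, seed=0):
    """Return n_samples standardized sums of n_terms i.i.d. Uniform(0, 1) draws."""
    rng = random.Random(seed)
    mean = n_terms * 0.5               # mean of the sum
    std = (n_terms / 12.0) ** 0.5      # Uniform(0, 1) has variance 1/12
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean) / std
        for _ in range(n_samples)
    ]


def fraction_within(values, k):
    """Fraction of values within k standard units of zero."""
    return sum(abs(v) <= k for v in values) / len(values)


if __name__ == "__main__":
    gauss = statistics.NormalDist()    # standard Gaussian, for reference
    for n_terms in (1, 2, 30):         # illustrative numbers of terms
        z = standardized_sums(n_terms, 20_000)
        for k in (1, 2):
            reference = gauss.cdf(k) - gauss.cdf(-k)
            print(f"terms={n_terms:2d}  within {k} sd: "
                  f"{fraction_within(z, k):.3f}  (Gaussian: {reference:.3f})")
```

With a single term, the fractions are those of the uniform distribution itself; as the number of terms grows, they should move towards the Gaussian values of about 68% and 95%.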

For all these reasons, the Gaussian distribution deserves attention, but it doesn’t mean that there aren’t other models that do too. For example, when you pool the output of independent series of events, like failures of different types on a machine, you tend towards a Poisson process, characterized by independent numbers of events in disjoint time intervals, and a constant occurrence rate over time. It is also quite useful but it doesn’t command the same level of attention as the Gaussian.
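
The pooling effect can also be sketched in a few lines. The simulation below assumes hypothetical failure types whose gaps between events are uniform rather than exponential, so no single type is a Poisson process. It then counts events in disjoint windows and reports the variance-to-mean ratio of the counts, which is close to 1 for Poisson counts. The number of sources, the gap distribution, the window sizing, and all function names are assumptions made for illustration.

```python
import random
import statistics


def renewal_events(rng, mean_gap, horizon):
    """Event times for one failure type; gaps are Uniform(0.5, 1.5) * mean_gap."""
    t = rng.uniform(0.0, mean_gap)     # random phase so sources are not synchronized
    times = []
    while t < horizon:
        times.append(t)
        t += rng.uniform(0.5 * mean_gap, 1.5 * mean_gap)
    return times


def window_counts(times, window, horizon):
    """Number of events in each disjoint window of the given length."""
    counts = [0] * int(horizon // window)
    for t in times:
        i = int(t // window)
        if i < len(counts):
            counts[i] += 1
    return counts


def variance_to_mean(counts):
    """Variance-to-mean ratio of the counts; close to 1 for Poisson counts."""
    return statistics.pvariance(counts) / statistics.fmean(counts)


def simulate(n_sources, mean_gap, horizon, events_per_window=5.0, seed=0):
    """Pool n_sources independent event series and return the ratio for the pool."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_sources):
        pooled.extend(renewal_events(rng, mean_gap, horizon))
    # The pooled rate is n_sources / mean_gap, so size the windows to hold
    # events_per_window events on average.
    window = events_per_window * mean_gap / n_sources
    return variance_to_mean(window_counts(pooled, window, horizon))


if __name__ == "__main__":
    for n_sources in (1, 200):         # illustrative values
        ratio = simulate(n_sources, mean_gap=200.0, horizon=200_000.0)
        print(f"sources={n_sources:3d}  variance/mean of window counts: {ratio:.2f}")
```

A single source with such regular gaps should give a ratio well below 1, while the pool of many sources should give a ratio near 1, consistent with the Poisson model.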

The most egregious misuse of the Gaussian distribution is in the rank-and-yank approach to human resources, which forces bosses to rate their subordinates “on a curve.” Measuring several dimensions of people’s performance and examining their distributions might make sense, but mandating that grades be “normally distributed” is absurd.