A kernel density estimate provides a means of estimating and visualizing the
probability density function of a random variable from a random sample. In
contrast to a histogram, a kernel density estimate is a smooth curve, with the
degree of smoothing controlled by a parameter called the *bandwidth*, here
denoted by *h*. With a well-chosen bandwidth, important features of the
distribution can be seen; a bandwidth that is too small undersmooths, one that
is too large oversmooths, and either can obscure those features.
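
To make the role of *h* concrete, here is a minimal sketch of a kernel density estimate using a Gaussian kernel, in which the estimate at a point *x* is the average of Gaussian bumps of width *h* centered on each observation. The kernel choice, the helper names `gaussian_kde` and `count_modes`, and the synthetic two-cluster sample are all assumptions made for illustration; the sample is generated data, not a real dataset. Counting local maxima on a grid is a rough way to see undersmoothing (many spurious bumps) and oversmoothing (the two clusters merged into one).

```python
import math
import random


def gaussian_kde(sample, h):
    """Return a function x -> density estimate (Gaussian kernel, bandwidth h)."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, then normalize.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)

    return density


def count_modes(density, lo, hi, steps=400):
    """Count local maxima of the density on an evenly spaced grid over [lo, hi]."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] >= ys[i + 1])


# Synthetic stand-in sample (an assumption, NOT real measurements):
# two normal clusters, generated with a fixed seed for repeatability.
rng = random.Random(0)
sample = ([rng.gauss(55, 6) for _ in range(100)]
          + [rng.gauss(80, 6) for _ in range(170)])

# Compare a small, a moderate, and a large bandwidth.
for h in (0.5, 4.0, 20.0):
    f = gaussian_kde(sample, h)
    print(f"h = {h:>4}: {count_modes(f, 30, 110)} local maxima")
```

In a run of this sketch, the smallest bandwidth should report many local maxima and the largest only one; the moderate value is the one expected to preserve the two-cluster shape, though the exact counts depend on the generated sample.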

Here we see a histogram and three kernel density estimates for a sample of
waiting times in minutes between eruptions of Old Faithful Geyser in
Yellowstone National Park, taken from R’s `faithful` dataset. The data follow
a bimodal distribution; short
eruptions are followed by a wait time averaging about 55 minutes, and long
eruptions by a wait time averaging about 80 minutes. In recent years, wait
times have been increasing, possibly due to the effects of earthquakes on the
geyser’s geohydrology.