What's the purpose of the density argument in plt.hist if we don't actually get a density plot?

I was looking at the examples on the matplotlib website, trying to figure out how to create a density figure instead of just a histogram.

The closest example I could find was the following, which seems to completely defeat the purpose of the density=True argument IMHO, because you'll have to manually find the best fit for each plot.

Imagine I have hundreds of such plots. Then doing

# add a 'best fit' line
y = ((1 / (np.sqrt(2 * np.pi) * sigma)) * np.exp(-0.5 * (1 / sigma * (bins - mu))**2))
ax.plot(bins, y, '--')

is far from optimal.

Wouldn't it be better to have density=True create a density plot directly and, for instance, have density='both' create a density plot overlaid on a histogram?

The density argument says how we normalize the bars in the histogram, not anything about the underlying distribution of the data. Matplotlib has no idea how your data is distributed.
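To make the normalization concrete, here is a minimal sketch using np.histogram, which plt.hist calls to do the binning. The sample data and bin count are made up for illustration. With density=True each bar height is the bin count divided by (total count × bin width), so the bar areas sum to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=100, scale=15, size=500)  # made-up sample

# Raw counts and density-normalized heights on the same 20 bins
counts, edges = np.histogram(data, bins=20)
density, _ = np.histogram(data, bins=20, density=True)

widths = np.diff(edges)
# The documented formula: counts / (sum(counts) * bin widths)
manual = counts / (counts.sum() * widths)

print(np.allclose(density, manual))   # heights match the formula
print((density * widths).sum())       # total bar area
```

Nothing here fits or assumes a distribution; it only rescales the bars.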

If you want to fit a Gaussian to a data set, script.stats has a function for that, and they also have functions returning the PDF to compare with the histograms.
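As a rough sketch of what that could look like with scipy.stats.norm: norm.fit returns maximum-likelihood estimates of the mean and standard deviation, and norm.pdf gives the curve to draw over the density-normalized bars. The helper name hist_with_normal_fit and the sample data are hypothetical, and this assumes a normal distribution is a sensible model for the data.

```python
import numpy as np
from scipy import stats


def hist_with_normal_fit(ax, data, bins=30):
    """Hypothetical helper: density histogram plus a fitted normal PDF.

    `ax` is expected to be a matplotlib Axes.
    """
    mu, sigma = stats.norm.fit(data)  # maximum-likelihood estimates
    _, edges, _ = ax.hist(data, bins=bins, density=True, alpha=0.5)
    x = np.linspace(edges[0], edges[-1], 200)
    ax.plot(x, stats.norm.pdf(x, loc=mu, scale=sigma), "--")
    return mu, sigma


# Made-up sample, only to show the fit itself
rng = np.random.default_rng(0)
sample = rng.normal(loc=100, scale=15, size=2000)
mu, sigma = stats.norm.fit(sample)
print(mu, sigma)
```

With fig, ax = plt.subplots() from matplotlib.pyplot, calling hist_with_normal_fit(ax, sample) inside a loop would cover many plots without writing the Gaussian formula by hand each time.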

Sorry, I suppose you mean scipy.stats instead of script.stats, right?

Initially I thought that the density argument might be close to what the seaborn interface has to offer.

Kernel density estimation is just one way to estimate a density; binning, counting, and normalizing (i.e. a histogram) is another way. People often call KDE line plots "density plots", which isn't unreasonable, but it also isn't exactly correct.
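A small sketch of the two estimates side by side, assuming scipy is available and using made-up data: both the normalized histogram and scipy.stats.gaussian_kde produce a non-negative function with total area of about 1. They just smooth the data differently.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(size=1000)  # made-up sample

# Estimate 1: bin, count, normalize (a density histogram)
density, edges = np.histogram(data, bins=30, density=True)
hist_area = (density * np.diff(edges)).sum()

# Estimate 2: Gaussian kernel density estimate (default bandwidth)
kde = stats.gaussian_kde(data)
kde_area = kde.integrate_box_1d(data.min() - 5, data.max() + 5)

# Smooth curve that could be drawn with ax.plot(x, kde_curve)
x = np.linspace(edges[0], edges[-1], 200)
kde_curve = kde(x)

print(hist_area, kde_area)
```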

It might also be helpful to address the narrow question in the title: plotting a histogram with density normalization lets you compare multiple distributions with different sizes, which might otherwise be difficult using a count histogram.
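For example, a sketch with two made-up samples drawn from the same distribution, one 100 times larger than the other. On shared bins the raw counts differ by roughly that factor, while the density-normalized heights land on a comparable scale.

```python
import numpy as np

rng = np.random.default_rng(2)
small = rng.normal(size=200)      # made-up small sample
large = rng.normal(size=20_000)   # made-up large sample
edges = np.linspace(-4, 4, 21)    # shared bin edges for both samples

counts_small, _ = np.histogram(small, bins=edges)
counts_large, _ = np.histogram(large, bins=edges)
dens_small, _ = np.histogram(small, bins=edges, density=True)
dens_large, _ = np.histogram(large, bins=edges, density=True)

# Count histograms: the large sample dwarfs the small one
print(counts_small.max(), counts_large.max())
# Density histograms: peaks are on the same scale
print(dens_small.max(), dens_large.max())
```

Passing the same edges and density=True to two plt.hist calls gives the overlaid version of this comparison.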


Matplotlib does have kernel density estimator code in it for violin plots, so it's not ridiculous to ask that it be provided as a function in matplotlib. See the GitHub issue "We have Violinplots but no kdeplot." (Issue #17341, matplotlib/matplotlib).
