axes.hist() with 2D input

Hi,

Working with images as 2D numpy arrays, it's very confusing to find that:

plt.hist(image.flatten())
plt.hist(image)

produce such different histograms. What I am expecting is the first one, which is the histogram of the image. But if I forget to flatten the array (which I don't need to do if I use numpy.histogram and then plot, adding to the confusion), the plotted histogram is strange. The number of bins is different, and so are the frequencies.

I'm not sure what the point of this message is, but I'd like to share my experience with this. I just spent a good 30 minutes trying to understand why the matplotlib histogram of my image was clearly wrong. The pyplot.hist and axes.hist documentation is not so great at explaining what is plotted with 2D input (I'm still not sure what I'm looking at), or at building an expectation that if you are working with an image, plt.hist(image) is NOT what you want.

To sum up, numpy.histogram and matplotlib's hist() have VERY different behaviors for 2D input arrays. numpy flattens, and matplotlib does... what exactly?

Any thoughts?

cheers,

Victor Poughon

pyplot.hist(x) where x is a N (rows) x M (cols) array will generate M
histograms and plot on a single Axes object.

Consider an array that likely has far fewer columns than your image:

import numpy
from matplotlib import pyplot

data = numpy.random.normal(size=(37, 4))
fig, ax = pyplot.subplots()
ax.hist(data)

Does that make sense?
-p
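
A minimal sketch of what the example above returns, assuming the default of 10 bins and a reasonably recent matplotlib. The Agg backend, the seeded generator, and the variable names are choices made for this illustration and are not part of the original reply:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
data = rng.normal(size=(37, 4))  # 37 rows, 4 columns

fig, ax = plt.subplots()
# 2D input: each column is treated as its own dataset
counts, edges, containers = ax.hist(data)
counts = np.asarray(counts)  # older versions may return a list of arrays

print(counts.shape)     # one row of counts per column
print(len(edges))       # one shared set of bin edges
print(len(containers))  # one group of bars per column

# numpy.histogram flattens 2D input and returns a single set of counts
flat_counts, flat_edges = np.histogram(data)
print(np.array_equal(counts.sum(axis=0), flat_counts))
```

Summing the per-column counts over the columns should give the same numbers as numpy.histogram on the whole array, since both use bin edges spanning the global minimum and maximum.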

···

On Thu, Oct 6, 2016 at 5:36 AM, Poughon Victor <Victor.Poughon at cnes.fr> wrote:

Hi,

Working with images as 2D numpy arrays, it's very confusing to find that:

plt.hist(image.flatten())
plt.hist(image)

produce such different histograms. What I am expecting is the first one,
which is the histogram of the image. But if I forget to flatten the array
(which I don't need to do if I use numpy.histogram and then plot, adding to
the confusion), the plotted histogram is strange. The number of bins is
different, and so are the frequencies.

I'm not sure what the point of this message is, but I'd like to share my
experience with this. I just spent a good 30 minutes trying to understand
why the matplotlib histogram of my image was clearly wrong. The pyplot.hist
and axes.hist documentation are not so great in explaining what is plotted
with 2D input (I'm still not sure what I'm looking at), or at building an
expectation that if you are working with an image, plt.hist(image) is NOT
what you want.

To sum up, numpy.histogram and matplotlib's hist() have VERY different
behaviors for 2D input arrays. numpy flattens, and matplotlib does... what
exactly?

Any thoughts?

cheers,

Victor Poughon

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users at python.org
https://mail.python.org/mailman/listinfo/matplotlib-users

Ohhhhhh I see now. I was so confused because the histogram with my 2D (non flatten()'ed) input looks like this:
http://i.imgur.com/NswA4El.png

but that's just because there are a lot of columns (3500 in that case), and matplotlib will render the histogram bars *side-by-side* for each bin.

Looking at the figure above, I thought that somehow the bin edges were not the same for each column, or that there were thousands of bins, or something like that, but it's just because it's putting 3500 vertical bars in each bin? So there are still 10 bins, but with a lot of vertical bars in each. For reference, here is the same histogram with flatten()'ed input: http://i.imgur.com/x0dwUoL.png. Thank you, your example code helped me get that.
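
The "10 bins, many bars per bin" picture can be checked with numpy alone. This is only a sketch under assumptions: the random array is a hypothetical stand-in for the image (50 rows, 3500 columns), and computing one shared set of edges and then histogramming each column against it is my approximation of what hist() does with 2D input, not a copy of its implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for an image: 50 rows, 3500 columns
image = rng.normal(size=(50, 3500))

# One shared set of 10 bins spanning the global min and max
edges = np.histogram_bin_edges(image, bins=10)

# One histogram per column, all using the same edges
per_column = np.array([np.histogram(col, bins=edges)[0] for col in image.T])

# The histogram of the whole image (numpy flattens 2D input)
flat_counts, flat_edges = np.histogram(image, bins=10)

print(per_column.shape)  # (number of columns, number of bins)
print(per_column.size)   # total number of bars that would be drawn
print(np.array_equal(per_column.sum(axis=0), flat_counts))
```

Each bin then holds one bar per column, which is why the unflattened plot looks so dense, and adding the bars up within each bin recovers the flattened histogram.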

I think the documentation for pyplot.hist and axes.hist could be improved regarding this issue. Should I have a go at it in a PR?

Thanks a lot,
(matplotlib is awesome)

Victor Poughon

···

________________________________
From: Paul Hobson [pmhobson at gmail.com]
Sent: Thursday, October 6, 2016 22:48
To: Poughon Victor
Cc: matplotlib-users at python.org
Subject: Re: [Matplotlib-users] axes.hist() with 2D input

pyplot.hist(x) where x is a N (rows) x M (cols) array will generate M histograms and plot on a single Axes object.

Consider an array that likely has far fewer columns than your image:

import numpy
from matplotlib import pyplot

data = numpy.random.normal(size=(37, 4))
fig, ax = pyplot.subplots()
ax.hist(data)

Does that make sense?
-p

Improvements to the documentation are always welcome. Feel free to take a
stab at clarifying the documentation here:
https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes/_axes.py#L5896

and maybe adding an example here:
https://github.com/matplotlib/matplotlib/tree/master/examples/statistics
(rendered: http://matplotlib.org/devdocs/examples/statistics/histogram_demo_multihist.html)

···

On Fri, Oct 7, 2016 at 4:33 AM, Poughon Victor <Victor.Poughon at cnes.fr> wrote:

I think the documentation for pyplot.hist and axes.hist could be improved
regarding this issue. Should I have a go at it in a PR?

Thanks a lot,
(matplotlib is awesome)

Victor Poughon