Combination of a box plot and a histogram

I'd like to make something in between a box plot [1] and a histogram. Each histogram would be represented by a single, tall, rectangular patch (like the box in a box plot), and the patch would be subdivided by the bin edges of the histogram. The face color of each sub-patch would replace the bar height in the histogram.

If any of that actually made sense:

* Does this type of plot have a name?

* Is there an easy way to do this in Matplotlib?

* If there isn't an easy way, what would be a good starting point? Initial ideas: 1) Use pcolor or imshow and embed this axes in a larger axes, 2) represent the sub-patches as a PolyCollection.

Thoughts?
-Tony

[1] e.g. http://matplotlib.sourceforge.net/examples/pylab_examples/boxplot_demo.html

Tony,

I am not quite sure I understand. Are you looking for error bars on the histogram, maybe?

http://matplotlib.sourceforge.net/users/screenshots.html#bar-charts

Or maybe something more like this:

http://matplotlib.sourceforge.net/examples/pylab_examples/bar_stacked.html

Or maybe something else in the gallery is more like what you want:

http://matplotlib.sourceforge.net/gallery.html

Ben Root

On Oct 1, 2010, at 9:40 AM, Benjamin Root wrote:

> I am not quite sure I understand.
>
> [snip]
>
> Or maybe something else in the gallery is more like what you want:
>
> http://matplotlib.sourceforge.net/gallery.html

I’ve checked the gallery, but I don’t see anything that appears similar. In any case, I ended up hacking together something that works. I’ve attached an image of what I had in mind (created with the code at the very bottom of this reply).

I ended up using mpl Rectangle objects and stringing them together using a PatchCollection. Maybe there’s a more efficient way to do this, but this approach worked well enough.

Best,

-Tony

hist_strip.png

"""
First attempt at a histogram strip chart (made up name).

if-main block taken from [1] except that I've replaced uniform distributions
with normal distributions.

[1] http://matplotlib.sourceforge.net/examples/pylab_examples/boxplot_demo3.html
"""
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import collections

# Bin counts are normalized by either the largest count or the total count.
NORM_TYPES = dict(max=max, sum=sum)


class BinCollection(collections.PatchCollection):
    """Stack of rectangles, one per histogram bin, shaded by bin count."""

    def __init__(self, hist, bin_edges, x=0, width=1, cmap=plt.cm.gray_r,
                 norm_type='max', **kwargs):
        # Rectangle is anchored at its lower-left corner, so each patch
        # starts at the lower edge of its bin.
        yy = bin_edges[:-1]
        heights = np.diff(bin_edges)
        bins = [plt.Rectangle((x, y), width, h) for y, h in zip(yy, heights)]
        norm = NORM_TYPES[norm_type]
        # Map the normalized counts (0 to 1) to face colors.
        fc = cmap(np.asarray(hist, dtype=float)/norm(hist))
        super(BinCollection, self).__init__(bins, facecolors=fc, **kwargs)


def histstrip(x, positions=None, widths=None, ax=None):
    """Draw one histogram strip per dataset in `x`."""
    if ax is None:
        ax = plt.gca()
    if positions is None:
        positions = range(1, len(x) + 1)
    if widths is None:
        # Default width: half of the smallest spacing between strips.
        widths = np.min(np.diff(positions)) / 2. * np.ones(len(positions))
    for data, x_pos, w in zip(x, positions, widths):
        x_pos -= w/2.  # left edge of the strip
        hist, bin_edges = np.histogram(data)
        bins = BinCollection(hist, bin_edges, width=w, x=x_pos)
        ax.add_collection(bins, autolim=True)
    ax.set_xticks(positions)
    ax.autoscale_view()


if __name__ == '__main__':
    np.random.seed(2)
    inc = 0.1
    e1 = np.random.normal(0, 1, size=(500,))
    e2 = np.random.normal(0, 1, size=(500,))
    e3 = np.random.normal(0, 1 + inc, size=(500,))
    e4 = np.random.normal(0, 1 + 2*inc, size=(500,))
    treatments = [e1, e2, e3, e4]
    fig, ax = plt.subplots()
    pos = np.array(range(len(treatments))) + 1
    histstrip(treatments, positions=pos, ax=ax)
    ax.set_xlabel('treatment')
    ax.set_ylabel('response')
    fig.subplots_adjust(right=0.99, top=0.99)
    plt.show()


Actually, that looks kinda cool. If anyone is aware of the name for this kind of plot, maybe we could add a new plotting function?

Ben Root

> class BinCollection(collections.PatchCollection):
>     def __init__(self, hist, bin_edges, x=0, width=1, cmap=plt.cm.gray_r,
>                  norm_type='max', **kwargs):
> [snip]
>         super(BinCollection, self).__init__(bins, facecolors=fc, **kwargs)

Is this equivalent to writing collections.PatchCollection.__init__()
and what are the advantages of super()?

I think you can use axes.pcolor() to replace BinCollection. pcolor()
just adds a collection for you, similar to the one you now build by hand.
With appropriate arguments it should do the job. You can also look
into pcolorfast() and pcolormesh().

> def histstrip(x, positions=None, widths=None, ax=None):
> [snip]
>         hist, bin_edges = np.histogram(data)

No other arguments to numpy.histogram() allowed?

[snip]

In my opinion this is too special to be added as a general matplotlib
plotting feature.

Friedrich
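
For reference, the pcolor suggestion above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name histstrip_mesh and its width, cmap, and edgecolors parameters are made up for the example, and counts are scaled by the largest bin as in the original norm_type='max' default.

```python
import numpy as np
import matplotlib.pyplot as plt


def histstrip_mesh(datasets, positions=None, width=0.5, ax=None,
                   cmap="gray_r", edgecolors="none"):
    """Hypothetical helper: one pcolormesh column per dataset."""
    if ax is None:
        ax = plt.gca()
    if positions is None:
        positions = list(range(1, len(datasets) + 1))
    meshes = []
    for data, x_pos in zip(datasets, positions):
        hist, bin_edges = np.histogram(data)
        # A single column needs two x edges; y uses the len(hist) + 1 bin edges.
        x_edges = np.array([x_pos - width / 2.0, x_pos + width / 2.0])
        # The color array has shape (rows, columns) = (len(hist), 1).
        colors = (hist / hist.max()).reshape(-1, 1)
        meshes.append(ax.pcolormesh(x_edges, bin_edges, colors, cmap=cmap,
                                    vmin=0.0, vmax=1.0,
                                    edgecolors=edgecolors))
    ax.set_xticks(positions)
    return meshes
```

Because pcolormesh takes the bin edges directly, this variant should also cope with unequal bin widths, and passing edgecolors="k" would outline each bin.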

If you don't need faceting (dark edges around the bins), imshow with
the extent set would be the easiest way. If you want faceting, pcolor
should work as well.
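
A minimal sketch of the imshow-with-extent idea, under some assumptions: the helper name histstrip_imshow and its parameters are invented for illustration, and the bins are taken to be equal-width (the numpy.histogram default), since imshow draws uniformly sized pixels.

```python
import numpy as np
import matplotlib.pyplot as plt


def histstrip_imshow(datasets, positions=None, width=0.5, ax=None,
                     cmap="gray_r"):
    """Hypothetical helper: draw each histogram as a one-column image."""
    if ax is None:
        ax = plt.gca()
    if positions is None:
        positions = list(range(1, len(datasets) + 1))
    images, y_min, y_max = [], np.inf, -np.inf
    for data, x_pos in zip(datasets, positions):
        hist, bin_edges = np.histogram(data)  # default bins are equal-width
        # extent = (left, right, bottom, top) in data coordinates.
        extent = (x_pos - width / 2.0, x_pos + width / 2.0,
                  bin_edges[0], bin_edges[-1])
        column = (hist / hist.max()).reshape(-1, 1)
        images.append(ax.imshow(column, extent=extent, origin="lower",
                                aspect="auto", interpolation="nearest",
                                cmap=cmap, vmin=0.0, vmax=1.0))
        y_min, y_max = min(y_min, bin_edges[0]), max(y_max, bin_edges[-1])
    # Set the limits explicitly so that every strip stays in view.
    ax.set_xlim(min(positions) - width, max(positions) + width)
    ax.set_ylim(y_min, y_max)
    ax.set_xticks(positions)
    return images
```

Here origin="lower" puts the first bin at the bottom of the strip, and aspect="auto" keeps imshow from forcing square pixels.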

On Oct 4, 2010, at 4:09 PM, Friedrich Romstedt wrote:

>> class BinCollection(collections.PatchCollection):
>>     def __init__(self, hist, bin_edges, x=0, width=1, cmap=plt.cm.gray_r,
>>                  norm_type='max', **kwargs):
>> [snip]
>>         super(BinCollection, self).__init__(bins, facecolors=fc, **kwargs)
>
> Is this equivalent to writing collections.PatchCollection.__init__()
> and what are the advantages of super()?

I believe collections.PatchCollection.__init__() is equivalent. In this instance, I don't think there are advantages (or disadvantages) to using super; it's just how I'm used to writing classes.
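
To illustrate the equivalence with a toy case: with single inheritance, both spellings end up calling the same parent initializer. The classes below are made up for the example and have nothing to do with matplotlib.

```python
class Base:
    def __init__(self, items, **kwargs):
        self.items = list(items)
        self.options = kwargs


class ViaSuper(Base):
    def __init__(self, items, **kwargs):
        # Looks up the next class in the method resolution order (here: Base).
        super(ViaSuper, self).__init__(items, **kwargs)


class ViaExplicit(Base):
    def __init__(self, items, **kwargs):
        # Names the parent class directly and passes self by hand.
        Base.__init__(self, items, **kwargs)


a = ViaSuper([1, 2], color="k")
b = ViaExplicit([1, 2], color="k")
print(a.items == b.items and a.options == b.options)  # prints True
```

The two only behave differently once multiple inheritance is involved, where super() follows the method resolution order instead of a hard-coded parent.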

> I think you can use axes.pcolor() to replace BinCollection. pcolor()
> just adds a collection similar to what you do now by hand for you.
> With appropriate arguments it should do the job. You can also look
> into pcolorfast() and pcolormesh().

Yes, you're right. This was actually my main question in the original post; i.e. what plotting function to start with. I'm not really sure how I overlooked pcolor(mesh) as a viable option. Thanks.

>> def histstrip(x, positions=None, widths=None, ax=None):
>> [snip]
>>         hist, bin_edges = np.histogram(data)
>
> No other arguments to numpy.histogram() allowed?

As I mentioned, this was just a function I hacked together. I didn't try to make it general purpose (yet).
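
One way the function might be generalized is to forward extra keyword arguments to numpy.histogram. The sketch below shows only the binning step; strip_histograms is a hypothetical name, while bins and range are the standard numpy.histogram parameters.

```python
import numpy as np


def strip_histograms(datasets, **hist_kwargs):
    """Return (hist, bin_edges) per dataset; extra keywords go to np.histogram."""
    return [np.histogram(data, **hist_kwargs) for data in datasets]


rng = np.random.RandomState(2)
treatments = [rng.normal(0, 1 + 0.1 * i, size=500) for i in range(4)]

# Shared bins and range make the strips directly comparable across treatments.
results = strip_histograms(treatments, bins=20, range=(-4, 4))
```

Inside histstrip, the same idea would amount to replacing np.histogram(data) with np.histogram(data, **hist_kwargs).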

> In my opinion this is too special to be added as a general matplotlib
> plotting feature.

I'd agree that it's pretty specialized; especially since I haven't been able to find any mention of it. I'm still curious if there's a name for this type of plot if anyone out there knows.

Best,
-Tony

On Oct 4, 2010, at 4:30 PM, John Hunter wrote:

> If you don't need faceting (dark edges around the bins), imshow with
> the extent set would be the easiest way. If you want faceting, pcolor
> should work as well.

Thanks! I'll give both imshow and pcolor a try. Most likely I'll use pcolor, since lighter bins would completely disappear without faceting (... or maybe that's a good thing).

-Tony

The barcode demo shows something similar with a binary color map for imshow:

http://matplotlib.sourceforge.net/examples/pylab_examples/barcode_demo.html

JDH
