boxplot

I've discovered that matplotlib does boxplots, and apparently this is what I should be using for one of the big graphs in my paper.

Two problems:

  1. I need to put 45 boxplots on a single date plot. Each of the boxes has a different amount of data that goes in it. Since the boxplot() function wants to calculate its own statistics (median, quartiles) rather than have me provide them, I need to either create a single 45xN Numeric array (which I can't, because the sets have different sizes), or else call it 45 times. But each time I call it, the last box obscures the previous one. That is, this code only shows one box:

===================
#!/usr/bin/python
from pylab import *

# fake up some data
set1 = (rand(50)+1) * 100
set2 = (rand(50)+2) * 100

boxplot(set1,positions=[1])
boxplot(set2,positions=[2])

show()

=================
The boxplot function returns a list of lines that it adds, but when I capture the lines from set1 and add them manually to the axes object, it fails. What should I do?

  2. I need to have the X axis of the boxplot be dates. There doesn't seem to be an easy way to do that.

Suggestions?

Thanks!

  2. I need to have the X axis of the boxplot be dates. There doesn't
seem to be an easy way to do that.

Use the "positions" keyword, as a list of date ordinals (output of date2num).
Then, use
gca().xaxis_date(tz)
where tz is your current timezone (you can use None, that's easier).
Et voilà.
You're probably going to have to play with tick rotation and date formatting, but
that's another story.

Using the boxplot_demo
#...
# multiple box plots on one figure
figure()
positions = [732659, 732660]
boxplot(data, positions=positions)
gca().xaxis_date(None)
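
For readers on a recent matplotlib, the suggestion above can be sketched roughly as follows. This is only a sketch under assumptions that are not in the thread: it uses the pyplot object interface instead of pylab, made-up sample sizes and dates, and the manage_ticks=False argument (available in matplotlib 3.1 and later) so that boxplot leaves the tick locator alone and xaxis_date() can install date ticks. The numbers returned by date2num depend on the matplotlib version's date epoch, so they will not necessarily match the 732659-style ordinals quoted here.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.dates import date2num

rng = np.random.default_rng(0)

# Fake data (assumption): one dataset per day, with different sample counts
data = [(rng.random(50) + 1) * 100, (rng.random(30) + 2) * 100]
days = [datetime(2006, 12, 15), datetime(2006, 12, 16)]

# Convert the dates to matplotlib's floating-point date numbers
positions = date2num(days)

fig, ax = plt.subplots()
# manage_ticks=False keeps boxplot from pinning the ticks to the box positions
result = ax.boxplot(data, positions=positions, widths=0.5, manage_ticks=False)

ax.xaxis_date()      # treat x values as dates (default timezone)
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
fig.canvas.draw()
```

Tick rotation and date formatting may still need tuning by hand, as noted above.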

Hm. thanks for the info. But it's not perfect... I get times in my formats, but not the dates. Here is the sample code:

#!/usr/bin/python

#
# Example boxplot code
#

from pylab import *
from matplotlib.dates import MonthLocator, WeekdayLocator, DateFormatter
from matplotlib.dates import MONDAY,SATURDAY

# fake up some data
set1 = (rand(50)+1) * 100
set2 = (rand(50)+2) * 100

boxplot(set1,positions=[732659])
boxplot(set2,positions=[732660])
ax = gca()

ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_major_formatter(DateFormatter('%D'))
ax.xaxis.set_minor_locator(WeekdayLocator(MONDAY))

ax.yaxis.set_major_formatter(FormatStrFormatter('%3.0f KB/s'))
ax.xaxis_date(None)
setp(ax.get_xticklabels(),'rotation',90,fontsize=8)
show()

==================
And yes, thanks for telling me about the timezone problem. I have been doing all of my work in GMT, only to be confounded.

We really need a manual that explains all of the axis stuff.

Now, how do I get two boxplots on the same plot?

(This would be SO MUCH EASIER if boxplot would take a list of objects that listed where the various thingies went...)

On Dec 15, 2006, at 8:56 PM, Pierre GM wrote:


Hm. thanks for the info. But it's not perfect... I get times in my
formats, but not the dates. Here is the sample code:

Yeah, I agree, the situation is far from ideal. Besides, it turns out that
there's no deep magic behind have_dates, which is just a way to tell the axis
to use AutoDateFormatter. Which we don't need.

So:
Plot your boxes with the positions keyword:

boxplot([set1, set2],positions=[732659,732660])
ax = gca()

Then use num2date:
timefmt = '%b-%d-%Y'
gca().set_xticklabels([num2date(x).strftime(timefmt) for x in gca().get_xticks()])
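
Put together, the two steps above might look like the sketch below on a recent matplotlib. The assumptions here are mine, not the thread's: the pyplot object interface, invented dates and sample sizes, and an explicit set_xticks call so the labels are guaranteed to line up with the boxes. Since boxplot accepts a list of sequences, the datasets do not need to have the same length, which is what the 45-box case needs.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.dates import date2num, num2date

rng = np.random.default_rng(1)

# Fake data (assumption): deliberately different lengths
set1 = (rng.random(50) + 1) * 100
set2 = (rng.random(20) + 2) * 100

# One x position per box, expressed as matplotlib date numbers
positions = date2num([datetime(2006, 12, 15), datetime(2006, 12, 16)])

fig, ax = plt.subplots()
# A single call draws every box; a list of arrays avoids needing a 2-D array
ax.boxplot([set1, set2], positions=positions)

# Format each position back into a readable date string
timefmt = "%b-%d-%Y"
labels = [num2date(x).strftime(timefmt) for x in positions]

ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=90, fontsize=8)
fig.canvas.draw()
```

The month abbreviation produced by %b follows the current locale, so the exact label text can vary between systems.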

Now, how do I get two boxplots on the same plot?

Well, just draw two axes.
Simson, now that you're more experienced with matplotlib, you should really
start speaking python to it.

fig = figure()
ax1 = fig.add_subplot(121)
ax2=fig.add_subplot(122)

ax1.boxplot([set1, set2],positions=[732659,732660])
ax2.boxplot([set2, set1],positions=[732659,732660])
ax1.set_xticklabels([num2date(x).strftime(timefmt) for x in ax1.get_xticks()])
ax2.set_xticklabels([num2date(x).strftime(timefmt) for x in ax2.get_xticks()])

On Friday 15 December 2006 21:07, Simson Garfinkel wrote:

Now, how do I get two boxplots on the same plot?

Well, just draw two axes.
Simson, now that you're more experienced with matplotlib, you should really
start speaking python to it.

I'd love to speak python to it. But it's harder when all of the examples are in matlab...

fig = figure()
ax1 = fig.add_subplot(121)
ax2=fig.add_subplot(122)

Hm. I'll need to figure out why these two subplots appear on the same axis.

BTW, this whole subplot(ijk) instead of subplot(i,j,k) notation is really, really confusing to me...

ax1.boxplot([set1, set2],positions=[732659,732660])
ax2.boxplot([set2, set1],positions=[732659,732660])
ax1.set_xticklabels([num2date(x).strftime(timefmt) for x in ax1.get_xticks()])
ax2.set_xticklabels([num2date(x).strftime(timefmt) for x in ax2.get_xticks()])

I'd love to speak python to it. But it's harder when all of the
examples are in matlab...

:)
Well, please have a look at pythonic_matplotlib.py in your examples folder.

> fig = figure()
> ax1 = fig.add_subplot(121)
> ax2=fig.add_subplot(122)

Hm. I'll need to figure out why these two subplots appear on the same
axis.

What do you mean? Do you want two plots on one figure, or two figures?
Do you want one plot in the top left corner and one in the bottom right? You can do
that as well, just tell matplotlib where to put the axes (a bit of
terminology here: an axes is a box in your figure, in which you will draw a
subplot).
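
As a rough illustration of telling matplotlib where to put the axes, here is a small sketch using Figure.add_axes, which takes a [left, bottom, width, height] rectangle in fractions of the figure size. The rectangle values, variable names, and fake data are assumptions chosen only for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
set1 = (rng.random(50) + 1) * 100
set2 = (rng.random(50) + 2) * 100

fig = plt.figure()

# Each rectangle is [left, bottom, width, height], as fractions of the figure
ax_top_left = fig.add_axes([0.10, 0.55, 0.35, 0.35])
ax_bottom_right = fig.add_axes([0.55, 0.10, 0.35, 0.35])

# Each axes is an independent box with its own boxplot
ax_top_left.boxplot(set1)
ax_bottom_right.boxplot(set2)
fig.canvas.draw()
```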

BTW, this whole subplot(ijk) instead of subplot(i,j,k) notation is
really, really confusing to me...

Don't get overwhelmed. ijk is a shortcut for (i, j, k), that works well if
you're working with less than 10 plots in either direction.
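
To make the shortcut concrete, the sketch below creates the same subplot with both spellings and checks that they land in the same place. It assumes a recent matplotlib and the pyplot object interface; the figure and variable names are made up for the example. The last case shows a grid where the three-digit form cannot be used, because the index has two digits.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Three-digit shorthand: 1 row, 2 columns, first subplot
fig_a = plt.figure()
ax_a = fig_a.add_subplot(121)

# Explicit form of the same request
fig_b = plt.figure()
ax_b = fig_b.add_subplot(1, 2, 1)

# Both axes occupy the same rectangle within their figures
same_place = np.allclose(ax_a.get_position().bounds,
                         ax_b.get_position().bounds)

# A 3x4 grid, last cell: index 12 does not fit in one digit,
# so only the explicit (rows, cols, index) form works here
fig_c = plt.figure()
ax_c = fig_c.add_subplot(3, 4, 12)
```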

I know, the learning curve is a bit steep at first, but soon you'll be a real
pro.

BTW, this whole subplot(ijk) instead of subplot(i,j,k) notation is
really, really confusing to me...

Don't get overwhelmed. ijk is a shortcut for (i, j, k), that works well if you're working with less than 10 plots in either direction.

It is a holdover from the early days of Matlab. It makes mpl more Matlab-like (for better or worse) and saves 2-4 keystrokes. Personally, I don't like it and would be inclined to discourage it in mpl.

Eric

I agree. It may be common in matlab, but it really doesn't belong in python.

On Dec 16, 2006, at 12:50 PM, Eric Firing wrote:
