how to make matplotlib faster?

John_Hunter · December 9, 2004, 4:38pm

    > I have experienced some extreme inefficiency using
    > errorbar plots for large datasets. Obviously, the
    > "hlines" routine is a huge bottleneck. Would it be
    > possible, in principle, to use an efficient collection
    > instead?

As Perry noted, it would be nice to see some of Eric's code to see if
this is the kind of bottleneck he is bumping into.

It would be very straightforward to use collections here; this is what
they were designed for - removing bottlenecks created by instantiating many
similar objects. I've never plotted a large number of errorbar lines
so haven't bumped into this one.
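As a rough illustration of the idea, here is a minimal sketch that draws all vertical error bars as one LineCollection instead of one line object per bar. The helper name errorbar_collection is hypothetical (it is not part of matplotlib), the Agg backend is chosen only so the sketch runs without a display, and the calls are those of a current matplotlib release, not necessarily the 2004 API.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def errorbar_collection(ax, x, y, yerr, **kwargs):
    """Hypothetical helper: draw vertical error bars as a single collection."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    yerr = np.asarray(yerr, dtype=float)
    # One segment per data point, from (x, y - yerr) to (x, y + yerr).
    lower = np.column_stack([x, y - yerr])
    upper = np.column_stack([x, y + yerr])
    segments = np.stack([lower, upper], axis=1)  # shape (N, 2, 2)
    bars = LineCollection(segments, **kwargs)
    ax.add_collection(bars)
    ax.autoscale_view()  # collections do not trigger autoscaling on their own
    return bars


fig, ax = plt.subplots()
x = np.linspace(0.0, 10.0, 5000)
y = np.sin(x)
bars = errorbar_collection(ax, x, y, 0.1 * np.ones_like(x),
                           colors="r", linewidths=1)
ax.plot(x, y, "k.")
fig.canvas.draw()
```

Only one artist is created for the 5000 bars, which is the kind of saving collections are meant to provide; whether it removes the bottleneck in a given case would still need to be measured.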

Note this might break some code which is relying on the fact that the
errorbar routine is returning a list of errorbar lines. Collections
are designed to respond similarly to lists of lines under the set
command, e.g.

set(lines, color='r', linewidth=4)

and

set(collection, color='r', linewidth=4)

will both work.

But if someone is currently doing

lines[2].set_color('g')

or

for line in lines:
    line.set_something(value)

there would be a backward incompatibility with this change. Note it
would be possible to define __setitem__, __getitem__, and possibly __setslice__,
__getslice__ and __iter__ for collections to make them behave more like lists
of objects, which would be nice if we (you) want to make this change.
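To make the list-like idea concrete, here is a minimal sketch of a wrapper that gives a LineCollection indexing and iteration, so that code such as lines[2].set_color('g') keeps working. The classes SegmentProxy and ListLikeSegments are hypothetical names invented for this illustration, and only set_color is covered; this is one possible design, not how matplotlib itself handles it.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
from matplotlib.collections import LineCollection
from matplotlib.colors import to_rgba


class SegmentProxy:
    """Hypothetical handle standing in for one line of the collection."""

    def __init__(self, owner, index):
        self._owner = owner
        self._index = index

    def set_color(self, color):
        # Update one row of the per-segment color table, then push it back.
        self._owner._colors[self._index] = to_rgba(color)
        self._owner._sync()


class ListLikeSegments:
    """Hypothetical wrapper giving a LineCollection list-style access."""

    def __init__(self, collection, color="b"):
        self.collection = collection
        n = len(collection.get_segments())
        self._colors = np.tile(to_rgba(color), (n, 1))  # one RGBA row per segment
        self._sync()

    def _sync(self):
        self.collection.set_color(self._colors.copy())

    def __len__(self):
        return len(self._colors)

    def __getitem__(self, index):
        n = len(self)
        if not -n <= index < n:
            raise IndexError("segment index out of range")
        return SegmentProxy(self, index % n)

    def __iter__(self):
        return (self[i] for i in range(len(self)))


segments = [[(0, 0), (0, 1)], [(1, 0), (1, 1)], [(2, 0), (2, 1)]]
lc = LineCollection(segments)
lines = ListLikeSegments(lc, color="b")

for line in lines:          # old-style loop over "lines"
    line.set_color("r")
lines[2].set_color("g")     # old-style indexed access
```

The items have no identity inside the collection itself; the proxy only remembers an index into the collection's per-segment color array, which is enough to emulate the list interface for properties the collection stores per item.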

Is anyone changing the properties of individual error lines returned
by errorbar?

JDH

Norbert_Nemec1 · December 9, 2004, 6:38pm

Would that be possible at all? Do individual items in a collection have an
identity at all that could be exposed in Python? Do they have individual
properties?

Note that I probably won't have the time to look into this matter myself.
Maybe one day, but certainly not in the near future.

Ciao,
Nobbi

On Thursday, 9 December 2004 at 17:38, John Hunter wrote:

Note it
would be possible to define __setitem__, __getitem__, and possibly __setslice__,
__getslice__ and __iter__ for collections to make them behave more like lists
of objects, which would be nice if we (you) want to make this change.


_Matt_Newville · December 9, 2004, 7:52pm

Hi,

I'm doing relatively simple line plots with the WXAgg backend, but I
also find matplotlib to be somewhat slower than I'd hope for.

On a Windows XP box (P4 1.7 GHz, 512 MB RAM), in a wx event loop
issuing a plot() as fast as I can go, I get about 1 plot every
0.25 to 0.30 sec. This is just barely fast enough for my needs.
If I could reliably go at 10 plots/sec, that would be great.

It turns out that the dynamic_demo_wx.py example does go much
faster, but it does not actually re-do a plot(). Instead it just
changes the subplot's line data. That's interesting, but I need
the view to be adjusted as well, as the scale will change with
time for my data. So far, I'm just re-issuing plot(), but I'd be
willing to do something slightly fancier.
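One "slightly fancier" option might look like the sketch below: keep a single line object, replace its data, and ask the axes to recompute their limits before redrawing. It assumes a current matplotlib release (relim and autoscale_view on the axes) and uses the Agg backend so it runs without a wx event loop; the update function and the test data are made up for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend instead of WXAgg
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([], [])  # create the line once, with no data yet


def update(x, y):
    """Replace the line data and rescale the view to fit it."""
    line.set_data(x, y)
    ax.relim()            # recompute data limits from the current artists
    ax.autoscale_view()   # apply the new limits to the x and y axes
    fig.canvas.draw()


# Simulated stream of data whose scale grows over time.
for scale in (1.0, 10.0, 100.0):
    x = np.linspace(0.0, scale, 200)
    update(x, scale * np.sin(x))
```

This avoids rebuilding the line on every frame, but the ticks and labels are still redrawn each time, so the gain depends on where the time is actually spent.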

Anyway, that led me to try to track down where the slowness in
plot() was coming from. Using nothing more sophisticated than
print statements, I believe the performance bottleneck is in
axis.py in Axis.draw(), in this block:

        for tick, loc, label in zip(majorTicks, majorLocs, majorLabels):
            if not interval.contains(loc): continue
            seen[loc] = 1
            tick.update_position(loc)
            tick.set_label1(label)
            tick.set_label2(label)
            tick.draw(renderer)
            extent = tick.label1.get_window_extent(renderer)
            ticklabelBoxes.append(extent)

For me, this block (run twice for a plot()) typically takes at
least 50% of the plot time. Commenting out the
tick.draw(renderer) and the following two 'extent' lines roughly
doubles the drawing rate (though no grid or ticks are shown). I
was surprised by this, but have not tracked it down much beyond
this. I'm not using mathtext in the labels and had only standard
numerical Tick labels in this example.

I don't know if this is applicable to the slowness of the contour
plots or error bars or if collections would help here. But it
doesn't seem like tick drawing should be the bottleneck. Anyway,
this seems like a simple place to test in other situations, and
may be a good place to look for possible optimizations.
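For anyone repeating this measurement, a profiler can stand in for the print statements. The sketch below profiles a single canvas draw with the standard-library cProfile module and collects the most expensive calls by cumulative time; it assumes a current matplotlib release and the Agg backend, so the numbers will not match the WXAgg timings above.

```python
import cProfile
import io
import pstats

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend instead of WXAgg
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = np.linspace(0.0, 10.0, 500)
ax.plot(x, np.sin(x))
ax.grid(True)

# Profile exactly one full redraw of the figure.
profiler = cProfile.Profile()
profiler.enable()
fig.canvas.draw()
profiler.disable()

# Collect the 15 entries with the largest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(15)
report = stream.getvalue()
print(report)
```

Entries from axis.py and text.py in the report give a rough idea of how much of the draw goes into ticks and tick labels.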

Thanks,

--Matt