Suggestions for improving speed of matplotlib

    > Someone here may have some ideas, but you're really going
    > to need to do some profiling to find out where your
    > bottlenecks are. If most of the time is spent in the actual
    > Agg drawing calls, no amount of Pyrex, psyco, etc. will
    > help.

Seconded. Spend some time with the profiler and when you find the
bottlenecks, post what you learn.
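For anyone who has not used the profiler before, here is a minimal sketch of what "spend some time with the profiler" can look like, using the standard-library cProfile and pstats modules of current Python versions. The names profile_call and make_chart are hypothetical; make_chart is only a stand-in for your own plotting function and does no drawing.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Collect the report in a string so it can be pasted into a post.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def make_chart(n_points):
    # Hypothetical stand-in: replace with the function that builds
    # and saves your figure.
    return sum(i * i for i in range(n_points))


if __name__ == "__main__":
    _, report = profile_call(make_chart, 100_000)
    print(report)
```

Sorting by cumulative time tends to show which high-level call dominates; the report text is the kind of thing worth posting along with the script.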

    > By the way, I haven't looked in the source, but does
    > matplotlib (and the Agg back-end) use "vectorized" calls to
    > the drawing? As an example, with wxPython, calling
    > DC.DrawPointList() with an Nx2 array of point coordinates
    > is orders of magnitude faster than looping through the
    > array and calling DC.DrawPoint() thousands of times. The
    > overhead of that round-trip between Python and C++ is
    > substantial. Maybe tricks like that could speed up the Agg
    > back-end too.

No, there is no low hanging fruit like that for the typical use cases
(line and marker drawing in agg). All of those loops happen in
extension code over numerix arrays.
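To make the per-call overhead from the quoted question concrete, here is a small sketch that needs no drawing library at all. The Canvas class is a hypothetical stand-in, not the wx or Agg API; it only contrasts one Python-level call per point with a single call that receives the whole list.

```python
import timeit


class Canvas:
    """Hypothetical drawing context; not a real wx or Agg class."""

    def __init__(self):
        self.pixels = set()

    def draw_point(self, x, y):
        # One Python-level call per point.
        self.pixels.add((x, y))

    def draw_point_list(self, points):
        # One call for all points; the loop runs inside set.update.
        self.pixels.update(points)


def draw_one_by_one(points):
    canvas = Canvas()
    for x, y in points:
        canvas.draw_point(x, y)
    return canvas.pixels


def draw_batched(points):
    canvas = Canvas()
    canvas.draw_point_list(points)
    return canvas.pixels


if __name__ == "__main__":
    points = [(i, (i * 7) % 700) for i in range(100_000)]
    # Both strategies must produce the same result.
    assert draw_one_by_one(points) == draw_batched(points)
    for func in (draw_one_by_one, draw_batched):
        seconds = timeit.timeit(lambda: func(points), number=5)
        print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```

The size of the gap depends on the machine and the Python version, so measure it rather than assume it.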

But there are plenty of opportunities for optimization in matplotlib
and the way it is used, so code snippets and profiler results will be
most helpful.

JDH

John,

If I understand correctly, the agg backend uses an image
representation that has an rgb triple for each pixel. Is
that correct?

Most 2d line drawings (generally the kind you expect to go very
fast) use very few colors (certainly fewer than 100). In that
case, using a color table could cut the image size to ~1/3 (one
index byte per pixel instead of an rgb triple).

That might actually help performance, as it seems that the size
of the image (and the transfer to backend -- with wx, for
example, this must be converted to a wx bitmap) may be a
significant bottleneck, at least from what I see.

I'm not saying that this would be easy to do, and it would not
be very useful for most images, but it might help performance
for 2d line drawings.
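The arithmetic behind the color-table idea can be sketched as follows. This assumes three bytes per pixel, as in the question above, and at most 256 colors so that one index byte per pixel is enough; the function names and the 800x600 image size are hypothetical, and nothing here reflects how the agg backend actually stores its buffer.

```python
def rgb_size(width, height, bytes_per_pixel=3):
    """Bytes needed for a raw image with one color tuple per pixel."""
    return width * height * bytes_per_pixel


def indexed_size(width, height, n_colors, bytes_per_color=3):
    """Bytes needed for one index byte per pixel plus the color table."""
    if n_colors > 256:
        raise ValueError("one index byte can address at most 256 colors")
    return width * height + n_colors * bytes_per_color


def palettize(pixels):
    """Convert a flat sequence of RGB tuples into (palette, indices)."""
    palette = []
    lookup = {}
    indices = bytearray()
    for color in pixels:
        index = lookup.get(color)
        if index is None:
            index = len(palette)
            if index > 255:
                raise ValueError("more than 256 distinct colors")
            lookup[color] = index
            palette.append(color)
        indices.append(index)
    return palette, indices


if __name__ == "__main__":
    raw = rgb_size(800, 600)               # 1440000 bytes
    packed = indexed_size(800, 600, 100)   # 480300 bytes
    print(raw, packed, round(packed / raw, 3))
```

With 100 colors the table itself is negligible, so the indexed image is about a third of the raw one. The cost is the extra pass to build the palette, which would have to be weighed against the transfer savings.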

--Matt

Something I mentioned before may be related: there is possibly a file-size problem with SVG output, and there definitely is one with EPS files.
Every point of a curve is written into the EPS file, even the points that fall outside the axes limits. To see this effect, follow the steps I describe in this message:

http://sourceforge.net/mailarchive/message.php?msg_id=11005531

To avoid this problem I trim the spectra by hand before plotting them. It's not very convenient, since it adds another step before anything can be plotted, but it works and the size of the output file decreases dramatically.
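The manual trimming step can be sketched as a small helper. This is only an illustration under two assumptions: the x values are sorted in ascending order, and the function name clip_to_xrange and its pad parameter are hypothetical, not part of matplotlib.

```python
from bisect import bisect_left, bisect_right


def clip_to_xrange(x, y, xmin, xmax, pad=1):
    """Return the part of (x, y) with xmin <= x <= xmax.

    x must be sorted ascending. pad extra points are kept on each
    side so the line still runs to the edges of the axes.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    start = max(bisect_left(x, xmin) - pad, 0)
    stop = min(bisect_right(x, xmax) + pad, len(x))
    return x[start:stop], y[start:stop]


if __name__ == "__main__":
    x = list(range(10_000))
    y = [value * value for value in x]
    x_small, y_small = clip_to_xrange(x, y, 2000, 2100)
    print(len(x), "points before,", len(x_small), "points after")
```

The trimmed lists are what you would then pass to the plot call, with the axes limits set to the same xmin and xmax.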

my 2 cents.

N.

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users

John Hunter wrote:

    > But there are plenty of opportunities for optimization in matplotlib
    > and the way it is used, so code snippets and profiler results will be
    > most helpful.

I spent a bit of time earlier today looking over the code to plot_day_summary(). My hunch at the time was that creating three Line2D objects per data point was a source of overhead, so I put together a script to profile things. The results I've gathered are rather inconclusive to me, but they may help others formulate an opinion.
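One way to test the hunch would be to build all the line segments up front instead of creating one object per segment. The sketch below is only that, a sketch: the function name day_summary_segments, the (t, open, close, high, low) tuple layout, and a tick size given in data units are all assumptions for illustration, not the actual plot_day_summary() code.

```python
def day_summary_segments(quotes, ticksize=0.3):
    """Build three line segments per quote.

    quotes: iterable of (t, open, close, high, low) tuples (assumed layout).
    Returns a flat list of ((x0, y0), (x1, y1)) segments.
    """
    segments = []
    for t, open_, close, high, low in quotes:
        # Vertical bar spanning the day's range.
        segments.append(((t, low), (t, high)))
        # Tick to the left at the opening price.
        segments.append(((t - ticksize, open_), (t, open_)))
        # Tick to the right at the closing price.
        segments.append(((t, close), (t + ticksize, close)))
    return segments


if __name__ == "__main__":
    quotes = [(1, 10.0, 11.0, 12.0, 9.0), (2, 11.0, 10.5, 11.5, 10.0)]
    for segment in day_summary_segments(quotes):
        print(segment)
```

A list like this could be handed to a single matplotlib.collections.LineCollection rather than to many Line2D objects. Whether that is actually faster here is exactly what the profile would have to show.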

The script and results are at http://agni.phys.iit.edu/~kmcivor/plot_day/

Vinj, you said you were using a derivation of plot_day_summary() to plot charts of size 1000x700. I have no idea what you mean about the chart size. Having access to your plotting function and an example data set would allow me to write a more accurate profiling script.

Ken