PNG performance tips

Hi Matplotlib users,

I have an application which produces PNG files using the AGG backend.
When I profile the application I can see that much of the cpu time is
spent in the method write_png called by print_figure in backend_agg.py.

Does anyone know which backend is the best for producing fast good
quality PNG files (with fast being as important as good quality)?

In another thread I read that antialiasing could be disabled for better
performance. I tried doing that in each call to contourf and it resulted
in a performance improvement. Does anyone have other performance tips
with regard to PNG files?
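
A minimal sketch of the antialiasing change described above, assuming the `antialiased` keyword that `contourf` accepts. The helper name `render_contour_png` and the test data are hypothetical and only serve the illustration; any speed-up would have to be measured on real data.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen with the Agg backend
import matplotlib.pyplot as plt
import numpy as np


def render_contour_png(antialiased):
    """Render a small filled contour plot and return the PNG bytes."""
    x = np.linspace(-3.0, 3.0, 200)
    y = np.linspace(-3.0, 3.0, 200)
    X, Y = np.meshgrid(x, y)
    Z = np.exp(-(X**2 + Y**2))  # made-up data, for illustration only

    fig, ax = plt.subplots()
    # Pass the flag explicitly on each contourf call.
    ax.contourf(X, Y, Z, 20, antialiased=antialiased)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    return buf.getvalue()


png_fast = render_contour_png(antialiased=False)
png_smooth = render_contour_png(antialiased=True)
```

Timing both calls (for example with `time.perf_counter`) on representative data would show whether the change is worth the loss in edge quality.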

Cheers,
Jesper

Jesper Larsen wrote:

> Hi Matplotlib users,
>
> I have an application which produces PNG files using the AGG backend.
> When I profile the application I can see that much of the cpu time is
> spent in the method write_png called by print_figure in backend_agg.py.

I have seen this myself. Keep in mind that timing includes a lot of disk I/O, so if your images are particularly large, or you're saving to a network or external disk, or if another process steps in at that moment and wants to read/write to the disk, that could be the bottleneck, more so than just the CPU time spent doing the PNG compression. On any reasonably modern PC, I suspect that's the case.
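
One rough way to check whether disk I/O or rendering and compression dominates is to time the two steps separately. The sketch below is only an assumption about how one might measure it: `time_png_save` is a hypothetical helper, and the in-memory figure includes rendering as well as PNG encoding.

```python
import io
import os
import tempfile
import time

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def time_png_save(fig, path):
    """Return (seconds to render+encode in memory, seconds to write to disk)."""
    t0 = time.perf_counter()
    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # rendering + PNG encoding, no disk involved
    t1 = time.perf_counter()
    with open(path, "wb") as fh:
        fh.write(buf.getvalue())
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to push the data to the device
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


fig, ax = plt.subplots()
ax.plot(range(1000))
out_path = os.path.join(tempfile.mkdtemp(), "timing_test.png")
encode_s, write_s = time_png_save(fig, out_path)
plt.close(fig)
print(f"render+encode: {encode_s:.4f}s, disk write: {write_s:.4f}s")
```

If the second number is small compared with the first on the target machine and disk, I/O is probably not the bottleneck for that figure.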

> Does anyone know which backend is the best for producing fast good
> quality PNG files (with fast being as important as good quality)?

They should all be approximately the same wrt actually writing out the file -- they're all using libpng either directly or indirectly. It also means there's not much that matplotlib can do to improve its performance, short of submitting patches to libpng -- but I suspect there isn't a lot of low-hanging fruit left to improve in such a widely-used library.

> In another thread I read that antialiasing could be disabled for better
> performance. I tried doing that in each call to contourf and it resulted
> in a performance improvement. Does anyone have other performance tips
> with regard to PNG files?

Saving to a Python file-like object (if you're doing that) is slower than saving directly to a file path.

See the recent thread on "Matplotlib performance" for a discussion of decimation of data (if your data set is really large).
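
The method discussed in that thread is not reproduced here. As a hedged illustration, the simplest form of decimation is to keep every n-th sample before plotting; the function name `decimate` and the `max_points` parameter below are made up for this sketch. Plain striding can drop narrow peaks, so it may not suit every data set.

```python
import numpy as np


def decimate(x, y, max_points):
    """Keep at most roughly max_points samples by taking every n-th point."""
    x = np.asarray(x)
    y = np.asarray(y)
    step = max(1, int(np.ceil(len(x) / max_points)))
    return x[::step], y[::step]


x = np.linspace(0.0, 10.0, 100_000)
y = np.sin(x)
xd, yd = decimate(x, y, 2_000)  # plot xd, yd instead of x, y
```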

Cheers,
Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

Michael Droettboom wrote:

> Saving to a Python file-like object (if you're doing that) is slower
> than saving directly to a file path.

This seems to contradict your previous assertion that the bottleneck is likely
to be disk I/O - if you're saving to a Python file-like object, there's no disk
I/O. Why is this slower?

thanks,
Dave
[snip]

>> I have an application which produces PNG files using the AGG backend.
>> When I profile the application I can see that much of the cpu time is
>> spent in the method write_png called by print_figure in backend_agg.py.

> I have seen this myself. Keep in mind that timing includes a lot of
> disk I/O, so if your images are particularly large, or you're saving to
> a network or external disk, or if another process steps in at that
> moment and wants to read/write to the disk, that could be the
> bottleneck, more so than just the CPU time spent doing the PNG
> compression. On any reasonably modern PC, I suspect that's the case.

>> Does anyone know which backend is the best for producing fast good
>> quality PNG files (with fast being as important as good quality)?

> They should all be approximately the same wrt actually writing out the
> file -- they're all using libpng either directly or indirectly. It also
> means there's not much that matplotlib can do to improve its
> performance, short of submitting patches to libpng -- but I suspect
> there isn't a lot of low-hanging fruit left to improve in such a
> widely-used library.

My application is web based. I am therefore considering serving the png files
directly from memory in a future release as outlined here:

http://www.scipy.org/Cookbook/Matplotlib/Matplotlib_and_Zope

I am still considering what impact that will have on my caching of the
plots, though. I am currently saving the png files and reusing them if the
same plot is requested again - but I am considering pickling individual
elements of the plots instead, since there are a lot of plots in which the
only difference is some text (I am already caching parts of the plot in
memory). But I don't know the performance of such a solution yet. I will give
you a heads up when I know (which won't be in the immediate future since I
have other things that are higher up on my to-do list).

>> In another thread I read that antialiasing could be disabled for better
>> performance. I tried doing that in each call to contourf and it resulted
>> in a performance improvement. Does anyone have other performance tips
>> with regard to PNG files?

> Saving to a Python file-like object (if you're doing that) is slower
> than saving directly to a file path.
>
> See the recent thread on "Matplotlib performance" for a discussion of
> decimation of data (if your data set is really large).

I am writing directly to a file path and I have already decimated my data
sets - so that won't help me.

Cheers,
Jesper

On Mon, 2008-03-03 at 08:16 -0500, Michael Droettboom wrote:

David Moore wrote:

>> Saving to a Python file-like object (if you're doing that) is slower than saving directly to a file path.

> This seems to contradict your previous assertion that the bottleneck is likely
> to be disk I/O - if you're saving to a Python file-like object, there's no disk
> I/O. Why is this slower?

All things being equal -- if the Python file-like object is saving to an on-disk file -- the Python file-like object will be slower. Sometimes people do this if they want another layer of abstraction (e.g. saving to a gzip file) or to be more general.

What I mean is that:

savefig(open(filename, 'wb'))

is slower than

savefig(filename)

because the former makes a Python function call to save each block of data.
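
To see the per-block calls in practice, one can pass `savefig` a file-like object that counts its own `write()` invocations. This is only a sketch: `CountingBytesIO` is a hypothetical helper, and the number of calls depends on the matplotlib version and on how its PNG writer is implemented, so no particular count should be expected.

```python
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt


class CountingBytesIO(io.BytesIO):
    """In-memory file that counts how often write() is called from Python."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
sink = CountingBytesIO()
fig.savefig(sink, format="png")  # format is required: there is no file name
plt.close(fig)
print("Python-level write() calls:", sink.write_calls)
```

Each counted call crosses from the encoder back into Python, which is the overhead that saving to a plain file path avoids.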

Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA