Trouble with imshow

[...]

I'll put it in as an enhancement, but I'm still unsure if there is a bug in there as well. Is there something I should be doing to clear memory after the first figure is closed, other than close()? I don't understand why memory usage grows each time I replot, but I'm pretty sure it isn't desirable behavior. As I mentioned, this effect is worse with plot.

So is this a bug or improper usage?

I'm not quite sure, but I don't think there is a specifically matplotlib memory-leak bug at work here. Are you using ipython, and if so, have you turned off the caching? In its default mode, ipython keeps lots of references, thereby keeping memory in use. Also, memory management and reporting can be a bit tricky and misleading.

Nevertheless, the attached script may be illustrating the problem. Try running it from the command line as-is (maybe shorten the loop--it doesn't take 100 iterations to show the pattern) and then commenting out the line as indicated in the comment. It seems that if anything is done that adds ever so slightly to memory use while the figure is displayed, then when the figure is closed, its memory is not reused. I'm puzzled.

I wasn't thinking straight--there is no mystery and no memory leak.
Ignore my example script referred to above. It was saving rows of the z
array, not single elements as I had intended, so of course memory use
was growing substantially.

Eric

You may not see a memory leak, but I still can't get my memory back without killing python. I turned off the ipython caching and even ran without ipython on both Windows and Ubuntu, but when I use imshow(), followed by close('all') and another imshow(), I run out of memory. I can see from the OS that the memory does not come back after close() and that it grows after the second imshow().

Any other ideas? Looks like a bug to me otherwise.

Except that I tried the same things and did not get quite the same
result. Let's track this down. Please try the attached script, and see
if the memory usage grows substantially, or just oscillates a bit.

Eric

One thing I noticed is that if I add a "def __del__(self): print 'del'"
to image._AxesImageBase, it never gets called. _AxesImageBase keeps
float64 and uint8 rgba images in a cache, which is never freed.

Adding a __del__ method defeats (or blocks) the garbage collection.

Sorry, never heard of that. I thought __del__() is called when the
reference count reaches 0.

It is, but if there are circular reference chains (cycles--and mpl is full of them) then the garbage collector has to identify them and remove them. If it encounters a __del__ it stops and leaves that cycle alone.

Eric
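[Editor's note: a minimal sketch of the behavior Eric describes, as it worked on the CPython 2.x of this era; PEP 442 changed this in Python 3.4+, where cycles with __del__ are collected too. The class names are made up for illustration.]

import gc

class Plain(object):
    pass

class WithDel(object):
    def __del__(self):
        pass

def make_cycle(cls):
    a, b = cls(), cls()
    a.other, b.other = b, a   # two objects referencing each other: a cycle

make_cycle(Plain)
print(gc.collect())   # the Plain cycle is found and freed
make_cycle(WithDel)
gc.collect()
print(gc.garbage)     # on Python 2, the WithDel cycle is stranded here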

···

On 02/03/2011 01:02 PM, Christoph Gohlke wrote:

Since self._imcache is an instance attribute, when the instance is no longer referenced, it should get garbage-collected, provided there is no __del__ method.

Eric


My understanding is that if there is a circular reference then the refcount will not be zero anyway. In this case _AxesImageBase instances and their image caches will never be deleted by the gc (__del__ method present or not) unless the circle is broken. When the interpreter quits, things are deleted by other means. I don't know the matplotlib code well enough to fix this and will instead work on reducing the memory overhead needed to plot an image. In the meantime it could help to explicitly delete the image cache when a plot closes, or to avoid caching altogether.

Christoph
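[Editor's note: a possible stopgap along the lines Christoph suggests, sketched as user code; _imcache is a private attribute of the image artist in the matplotlib of this era, so this is an assumption that may break between versions.]

import matplotlib.pyplot as plt

def drop_image_caches(event):
    # Release the cached rgba arrays of every image in the figure
    # that is being closed.
    for ax in event.canvas.figure.axes:
        for im in ax.images:
            im._imcache = None   # private attribute: an assumption

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.imshow([[0, 1], [1, 0]])
fig.canvas.mpl_connect('close_event', drop_image_caches)
plt.show()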

···

On 2/3/2011 3:13 PM, Eric Firing wrote:


Just to clarify the circular reference paradox: my recollection is that the garbage collector can pick such a cycle up because it uses weak references instead of regular references, so when a circular reference occurs but is not hard-referenced by anything else, it is "detached" and the gc picks it up. I am not very familiar with this concept in particular, but I do remember it being explained this way.

If anyone is more knowledgeable about this, I would welcome further comments or corrections to what I remember.

Also, not to sound too annoying, but has anyone considered the idea of using compressed arrays for holding those rgba values?

Ben Root
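[Editor's note: for readers unsure what weak references do, a short illustration of the general Python mechanism; whether and where mpl uses them is a separate question, addressed below.]

import weakref

class Node(object):
    pass

n = Node()
r = weakref.ref(n)
print(r() is n)   # True: the referent is still alive
del n
print(r())        # None: the weak reference did not keep it alive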

···

On Thu, Feb 3, 2011 at 6:37 PM, Christoph Gohlke <cgohlke@…3426…3…> wrote:

Also, not to sound too annoying, but has anyone considered the idea of
using compressed arrays for holding those rgba values?

I don't see how that really helps; as far as I know, a full rgba array has to be passed into agg. What *does* help is using uint8 from start to finish. It might also be possible to use some smart downsampling before generating the rgba array, but the uint8 route seems to me the first thing to attack.

Eric
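[Editor's note: the bytes keyword of the colormap call is the hook referred to below as having been added in 2007; a quick sketch of the memory difference, with illustrative array sizes.]

import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as mcolors

data = np.random.random((512, 512))
norm = mcolors.Normalize(vmin=0.0, vmax=1.0)

rgba_f64 = cm.jet(norm(data))              # default: float rgba, 8 bytes/channel
rgba_u8 = cm.jet(norm(data), bytes=True)   # uint8 rgba, 1 byte/channel
print(rgba_f64.nbytes, rgba_u8.nbytes)     # 8388608 vs 1048576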

···

On 02/03/2011 03:04 PM, Benjamin Root wrote:

Please review the attached patch. It avoids generating and storing float64 rgba arrays and uses uint8 rgba instead. That's a huge memory saving and also faster. I can't see any side effects as _image.fromarray() converts the float64 input to uint8 anyway.

So far other attempts to optimize memory usage were thwarted by matplotlib's internal use of masked arrays. As mentioned before, users can provide their own normalized rgba arrays to avoid all this processing.

Christoph

image.diff (753 Bytes)

···

On 2/3/2011 6:50 PM, Eric Firing wrote:

Christoph,

Thank you! I haven't found anything wrong with that delightfully simple patch, so I have committed it to the trunk. Back in 2007 I added the ability of colormapping to generate uint8 directly, precisely to enable this sort of optimization. Why it was not already being used in imshow, I don't know--maybe I was going to do it, got sidetracked, and never finished.

I suspect it won't be as simple as for the plain image, but there may be opportunities for optimizing with uint8 in other image-like operations.

So far other attempts to optimize memory usage were thwarted by
matplotlib's internal use of masked arrays. As mentioned before, users
can provide their own normalized rgba arrays to avoid all this processing.

Did you see other potential low-hanging fruit that might be harvested with some changes to the code associated with masked arrays?

Eric

···

On 02/03/2011 05:35 PM, Christoph Gohlke wrote:

The norm function currently converts the data to double precision floating point and also creates temporary arrays that can be avoided. For float32 and low-precision integer images this seems overkill, and one could use float32 instead. It might be possible to replace the norm function with numpy.digitize, if that works with masked arrays. Last, the _image.frombyte function makes a copy of strided arrays (only relevant when zooming/panning large images). I'll try to provide a patch for each.

Christoph
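[Editor's note: a sketch of the saving Christoph describes, using a hypothetical helper rather than matplotlib's actual Normalize: keep float32 data in float32 and work in place instead of allocating float64 temporaries.]

import numpy as np

def normalize32(img, vmin, vmax):
    # One working copy in float32, then in-place operations, so no
    # further full-size temporaries or float64 promotion occur.
    result = np.array(img, dtype=np.float32)
    result -= vmin
    result /= (vmax - vmin)
    return result

z = (np.random.random((1000, 1000)) * 100).astype(np.float32)
zn = normalize32(z, z.min(), z.max())
print(zn.dtype, zn.min(), zn.max())   # float32 0.0 1.0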

···

On 2/4/2011 11:54 AM, Eric Firing wrote:

Masked arrays can be filled to create an ndarray before passing to digitize; whether that will be faster remains to be seen. I've never used digitize.

Regarding frombyte, I suspect you can't avoid the copy; the data structure being passed to agg is just a string of bytes, as far as I can see, so everything is based on having a simple contiguous array.

Eric

···

On 02/04/2011 10:28 AM, Christoph Gohlke wrote:

I didn't say that ("can be filled...") right. I think one would need to use the mask to put in the i_bad index where appropriate. np.ma does not have a digitize function. I suspect it won't help much, if at all, in Normalize, but it would be a natural fit for BoundaryNorm.

It looks easy to allow Normalize.__call__ to use float32 if that is what
it receives.

I don't see any unnecessary temporary array creation apart from the conversion to float64, except for the generation of a masked array regardless of input. I don't think this costs much; if it gets an ndarray it does not copy it, and it does not generate a full mask array. Still, the function probably could be sped up a bit by handling masking more explicitly instead of letting ma do the work.

Eric
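[Editor's note: a sketch of the digitize idea for BoundaryNorm, not matplotlib's implementation; the mask is applied afterwards to set a bad-value index, here called i_bad as in the discussion.]

import numpy as np
import numpy.ma as ma

boundaries = np.array([0.0, 0.25, 0.5, 1.0])
data = ma.masked_invalid(np.array([[0.1, 0.3], [np.nan, 0.7]]))

# digitize wants a plain 1-D ndarray, so fill the mask and flatten first.
indices = np.digitize(data.filled(boundaries[0]).ravel(), boundaries) - 1
indices = indices.reshape(data.shape)

i_bad = len(boundaries)                  # sentinel index for masked values
indices[ma.getmaskarray(data)] = i_bad
print(indices)                           # [[0 1] [4 2]]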

···

On 02/04/2011 11:33 AM, Eric Firing wrote:

In class Normalize:

    result = 0.0 * val

and

    result = (val-vmin) / (vmax-vmin)

Regarding frombyte: the PyArray_ContiguousFromObject call will return a copy if the input array is not already contiguous.

Christoph

···

On 2/4/2011 2:14 PM, Eric Firing wrote:

Exactly. I thought you were suggesting that this was not needed, but maybe I misunderstood.

Eric

···

On 02/04/2011 12:33 PM, Christoph Gohlke wrote:

In fact I am suggesting that it is not needed. The copy-to-agg routine could be made 'stride aware' and use PyArray_FromObject. Not very low-hanging fruit, but it seems fromarray() does this already. I'm not sure it's worth it.

How about these changes to colors.py (attached)? This avoids copies, uses in-place operations, and calculates in single precision when normalizing small-integer and float32 arrays. Similar could be done for LogNorm. Do masked arrays support in-place operations?

Christoph

colors.diff (2.51 KB)
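[Editor's note: to the in-place question at the end, a quick check; masked arrays do support augmented assignment and preserve both the mask and the dtype.]

import numpy as np
import numpy.ma as ma

vmin, vmax = 0.0, 2.0
val = ma.masked_invalid(np.array([0.0, 1.0, np.nan, 2.0], dtype=np.float32))
val -= vmin            # in-place, mask preserved
val /= (vmax - vmin)   # in-place, mask preserved
print(val)             # [0.0 0.5 -- 1.0]
print(val.dtype)       # float32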

···

On 2/4/2011 3:29 PM, Eric Firing wrote:

Christoph,

Thank you.

Done (with slight modifications) in 8946 (trunk).

I was surprised by the speedup in normalizing large arrays when using float32 versus float64: a factor of 10 on my machine with a (1000, 1000) array, timed with ipython's %timeit. Because of the way %timeit does multiple tests, I suspect it may exaggerate cache effects.

Eric
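[Editor's note: the comparison can be reproduced along these lines; the simplified norm below stands in for Normalize, and the exact factor is machine dependent.]

import numpy as np
import timeit

z64 = np.random.random((1000, 1000))    # float64
z32 = z64.astype(np.float32)

def norm(a):
    return (a - a.min()) / (a.max() - a.min())

for z in (z64, z32):
    t = min(timeit.repeat(lambda: norm(z), number=10, repeat=3)) / 10
    print(z.dtype, '%.2f ms' % (t * 1e3))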

···

On 02/04/2011 02:03 PM, Christoph Gohlke wrote:

Please consider the attached patch for the _image.frombyte function. It avoids temporary copies in case of non-contiguous input arrays. Copying a 1024x1024 slice out of a contiguous 4096x4096 RGBA or RGB array is about 7x faster (a common case for zooming/panning). Copying contiguous RGB input arrays is ~2x faster. Tested on win32-py2.7.

Christoph

_image.cpp.diff (3.47 KB)
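[Editor's note: the situation the patch targets, restated in numpy terms; a zoomed or panned view is a strided slice, and forcing contiguity, which is what PyArray_ContiguousFromObject does, copies it.]

import numpy as np

rgba = np.zeros((4096, 4096, 4), dtype=np.uint8)
view = rgba[1024:2048, 1024:2048]   # strided view into the big array, no copy
print(view.flags['C_CONTIGUOUS'])   # False
buf = np.ascontiguousarray(view)    # makes the copy the patch avoids
print(buf.flags['C_CONTIGUOUS'])    # True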

···

On 2/5/2011 1:02 PM, Eric Firing wrote:

Thank you!

Looks good, speeds up zooming and panning on large images as advertised. An 8000x8000 image is actually manageable now. interpolation='nearest' is still very slow until the image is substantially zoomed, but everything is quite quick with other interpolation styles. The slowness of 'nearest' looks like a basic characteristic of the implementation.

I committed the patch in 8966.

Before that I found and committed a big speed-up in Normalize.

Eric
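[Editor's note: the interpolation style Eric compares is a per-call keyword of imshow; a minimal illustration.]

import numpy as np
import matplotlib.pyplot as plt

z = np.random.random((2000, 2000)).astype(np.float32)
fig, ax = plt.subplots()
ax.imshow(z, interpolation='bilinear')   # try 'nearest' to see the slowdown
plt.show()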

···

On 02/08/2011 02:39 PM, Christoph Gohlke wrote:

Eric,

How much is the speed-up in Normalize? It might be worth applying it to LogNorm as well. (As an aside, I find the duplication of code in Normalize and friends a little disconcerting...)

Also, what would it take (if it is at all possible) to take advantage of these image optimizations while using pcolor?

Ben Root

···

On Wed, Feb 9, 2011 at 1:50 AM, Eric Firing <efiring@…202…> wrote:

Bug Report:

At some point between the recent revision and r8934, setting the alpha value to anything but None will cause the image to not show. I suspect it has something to do with some of the recent revisions. Maybe the alpha values were being converted into an integer, causing them to be zero? Then again, even setting alpha to 1 will cause the image to disappear.

Ideas? Thoughts? I included an example script below.

Ben Root

Example script:

import numpy as np
import matplotlib.pyplot as plt

z = np.random.random((40, 50))

fig = plt.figure()

ax = fig.add_subplot(1, 2, 1)
ax.imshow(z, alpha=1.0)
ax.set_title('Blank!')

ax = fig.add_subplot(1, 2, 2)
ax.imshow(z, alpha=None)
ax.set_title("Not Blank")

plt.show()

···

On Wed, Feb 9, 2011 at 1:50 AM, Eric Firing <efiring@…202…> wrote:

How much is the speed-up in Normalize? It might be worth applying it to LogNorm as well. (As an aside, I find the duplication of code in Normalize and friends a little disconcerting...)

Timing: I don't recall exactly, but it was along the lines of 3.5 seconds getting knocked down to 0.6 seconds.

Yes, I will apply it to LogNorm.

Duplication: yes, but every time I look at it I see that there are subtle but essential differences here and there, so it does not look worthwhile to try to factor out the remaining commonality.

Also, what would it take (if it is at all possible) to take advantage of
these image optimizations while using pcolor?

The norm and cmap optimizations are already taken advantage of by all color-mapping operations. The optimization of using uint8 was already being done in quadmesh, nonuniform image, and pcolor image (all used by pcolorfast). The image.frombyte optimization that Christoph just came up with (8966) is specific to image. There may be some similar optimizations that could be made for other image-like plots (reducing copying and temporaries), but I don't think there will be the same level of payoff.

Pcolor itself is hopelessly slow by nature, and should be used only for relatively small arrays. And then there is hexbin, which is *really* slow. It could be sped up with the addition of some cython, but would need a quadmesh-like extension to make it really speedy. (All this is off the top of my head, so I may be misstating a point here and there.)

Eric
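[Editor's note: pcolorfast is the Axes method that picks the image/quadmesh fast paths Eric lists; a minimal comparison on a regular grid.]

import numpy as np
import matplotlib.pyplot as plt

z = np.random.random((500, 500))
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.pcolor(z)        # one polygon per cell: slow for large arrays
ax2.pcolorfast(z)    # image/quadmesh path: much faster on a regular grid
plt.show()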

···

On 02/09/2011 12:36 PM, Benjamin Root wrote:


Thanks for the report. I'll fix it some time today.

Eric

···

On 02/09/2011 02:29 PM, Benjamin Root wrote:

This should fix it:

Index: lib/matplotlib/colors.py
===================================================================
--- lib/matplotlib/colors.py (revision 8967)
+++ lib/matplotlib/colors.py (working copy)
@@ -49,6 +49,7 @@
 'chartreuse' are supported.
 """
 import re
+import math
 import numpy as np
 from numpy import ma
 import matplotlib.cbook as cbook
@@ -547,6 +548,8 @@
         if alpha is not None:
             alpha = min(alpha, 1.0) # alpha must be between 0 and 1
             alpha = max(alpha, 0.0)
+            if bytes:
+                alpha = int(math.floor(alpha*255.9999999))
             if (lut[-1] == 0).all():
                 lut[:-1, -1] = alpha
                 # All zeros is taken as a flag for the default bad

colors_alpha_fix.diff (725 Bytes)

Christoph
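[Editor's note: the 255.9999999 factor maps the unit interval onto 0..255 in equal-width buckets while keeping alpha = 1.0 from overflowing to 256.]

import math

for alpha in (0.0, 0.5, 1.0):
    print(alpha, int(math.floor(alpha * 255.9999999)))
# 0.0 -> 0, 0.5 -> 127, 1.0 -> 255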

···

On 2/9/2011 4:29 PM, Benjamin Root wrote: