I sent this message sometime on Friday, but as it doesn't seem to have made it to the SourceForge list yet, I'm assuming it's not going to and am trying again.
> Hi, I made a suggestion for improving imshow performance
> for plotting an image already in byte string form a
> while ago; some of the results are currently in CVS. It
> seems that other changes made to CVS at the same time or
> since mean that the floating point buffer source code is
> now much faster and I'd suggest taking anything sourced
> from my code back out.
Maybe that explains why I was never able to get the same performance
benefits you were seeing.... But I was able to get 1.5x - 2.5x faster
with your patch, which is pretty damned good. And the memory
benefit of your patch could be substantial as well. Why pull it?
When I first tried the new CVS code (other than my own) I was rather
rushed trying to get a paper finished and tested with too small an image
and thus saw too small a speedup for it to be worth it. I'm also using
non-contiguous arrays which make the relative speed increase smaller. I
also didn't think the interface was particularly user friendly.
I'm now sure why my code was faster (a mixture of data copying using
optimised functions and no need to do floating point to integer conversion)
and have written some new code which is slightly faster than the old and
has a nicer interface (patch to CVS attached).
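To illustrate the second point, here is a rough sketch of the per-element work a floating point source needs and a byte source avoids. This is not the code from the patch: the function name floats_to_bytes and the assumption that values are scaled to [0, 1] are mine, purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only (not taken from the patch or from CVS).
// Assumes the floating point values are colour components scaled to [0, 1].
// Every element costs two comparisons, a multiply, an add and a
// float-to-integer cast; a byte source needs none of this.
std::vector<unsigned char> floats_to_bytes(const double* src, std::size_t count) {
    std::vector<unsigned char> dst(count);
    if (count == 0) {
        return dst;
    }
    assert(src != nullptr);
    for (std::size_t i = 0; i < count; ++i) {
        double v = src[i];
        if (!(v > 0.0)) v = 0.0;  // clamps negatives (and NaN) to 0
        if (v > 1.0) v = 1.0;     // clamps overshoot to 1
        dst[i] = static_cast<unsigned char>(v * 255.0 + 0.5);  // round to nearest
    }
    return dst;
}
```

When the data already arrives as unsigned bytes this whole loop can be skipped, which is where I believe part of the saving comes from.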
What are your latest profiling numbers?
Profiling with my latest version and comparing to CVS I get:
Running with 1024x1024:
Array set up: resident stack size: 37408, size: 12571
Tests done: resident stack size: 41536, size: 13596
Byte set up: resident stack size: 12876, size: 6429
Tests done: resident stack size: 12880, size: 6428
Fractional improvement: 13.724
Running with 2048x2048:
Array set up: resident stack size: 143956, size: 39197
Tests done: resident stack size: 156252, size: 42269
Byte set up: resident stack size: 37464, size: 12573
Tests done: resident stack size: 37464, size: 12572
Fractional improvement: 13.100
Running with 4096x4096:
Array set up: resident stack size: 561756, size: 143646
Tests done: resident stack size: 609388, size: 155963
Byte set up: resident stack size: 66376, size: 37178
Tests done: resident stack size: 132280, size: 37177
Fractional improvement: 13.943
Your patch came in just at the time I was leaving for Paris for a
meeting, after which I experienced a rather nasty total hard drive
failure that set me back in my mpl maintenance. I am not sure I am
ready to give up on the ideas you introduced.... Can you remind me --
did I manage to incorporate all of the matplotlib.axes and
matplotlib.image patches I needed to take advantage of your patches to
the _image extension code?
I think it's partly in there in a non-functional form. The patch I've
attached removes it and adds a new function to the Axes class called
directshow. This accepts the same syntax as imshow (where relevant)
rather than adding options to imshow; I chose to do this because my old
syntax for passing through imshow wasn't that easy to understand and
didn't make the different functionality clear. This function calls a
class called DirectImage, which inherits from AxesImage.
I've also rewritten my C++ image object creation function, called
frombyte, to take an unsigned byte array as input rather than a buffer.
By using the std::memcpy function rather than a loop for copying, the
speed advantage of passing data in as a buffer disappears, and using
arrays is generally more intuitive. The function still only takes x*y*4
arrays as input at the moment, as the processor time decrease from not
using loops to copy is fairly significant.
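As a minimal sketch of the difference between the two copying strategies, assuming a contiguous x*y*4 (RGBA) unsigned byte image: the helper names below are hypothetical and this is not the frombyte code from the patch, only the general idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helpers for illustration; not the frombyte implementation.

// Copies a contiguous height*width*4 byte image with a single std::memcpy.
std::vector<unsigned char> copy_rgba_memcpy(const unsigned char* src,
                                            std::size_t height,
                                            std::size_t width) {
    const std::size_t nbytes = height * width * 4;
    std::vector<unsigned char> dst(nbytes);
    if (nbytes > 0) {
        assert(src != nullptr);
        std::memcpy(dst.data(), src, nbytes);
    }
    return dst;
}

// The same copy written as explicit loops over rows, columns and channels.
std::vector<unsigned char> copy_rgba_loop(const unsigned char* src,
                                          std::size_t height,
                                          std::size_t width) {
    std::vector<unsigned char> dst(height * width * 4);
    for (std::size_t row = 0; row < height; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            for (std::size_t ch = 0; ch < 4; ++ch) {
                const std::size_t idx = (row * width + col) * 4 + ch;
                dst[idx] = src[idx];
            }
        }
    }
    return dst;
}
```

Both give the same result, but the single-call version only works when the source is one contiguous block, which is why restricting the input to x*y*4 arrays matters here.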
patch (10.1 KB)
On Thu, 2005-05-26 at 21:15 -0500, John Hunter wrote: