On Wednesday, November 17, 2010, Benjamin Root <ben.root@…553…> wrote:
On Tue, Nov 16, 2010 at 5:20 PM, J P <jpscipy@…149…> wrote:
Hi all, here’s my first patch for matplotlib. Someone noticed at Stack Overflow that the plot_surface function in mplot3d wasn’t especially fast for a lot of points (and small rstrides/cstrides) and using shading and a single color. I found some parts of the code that weren’t vectorized. These are my changes so far.
Summary of changes:
- Changed from double looping over aranges to using xrange
- Made the normalization of the normals and their dot product with the vector [-1,-1,0.5] to find the shading a vectorized operation.
- Changed a list comprehension which calculated the colors using an iterative approach to using the already built-in vectorization of the Normalization class and using the np.outer function. The result is a numpy array rather than a list which actually speeds up things down the line.
- removed the corners array from plot_surface which wasn’t ever used or returned. It didn’t really slow things down, but I’m thinking that it is cruft.
For change number two, I made a separate function that generates the shades, but feel free to move that around if you prefer… or maybe it should be a function that begins with a _ because it shouldn’t be used externally. These changes give varying levels of speed improvement depending on the number of points and the rstrides/cstrides arguments. With larger numbers of points and small rstrides/cstrides, these changes can more than halve the running time. I have found no difference in output after my changes.
I know there is more work to be done within the plot_surface function and I’ll submit more changes soon.
Thank you for your efforts to improve the performance of mplot3d. I will take a look at the patches you have submitted and test them out. What I am probably going to do is break down the patches into smaller pieces and incorporate them piece-by-piece.
I tried refactoring plot_surface once before to mixed results. The key feature I was trying to gain was compatibility with masked arrays. I wanted to duplicate the behavior of contourf and pcolor where masked out portions of the surface would not be created. I managed to get it to work for that particular use-case, but I broke a bunch of other use-cases along the way. I am curious to see if your patches make this easier/harder to do.
Thank you for your efforts and thank you for using matplotlib!
I have looked through the patches, and there are definite
improvements. I have broken the work into four separate patches. The
first patch is essentially code cleanup and utilization of xrange
(plot_surface_cleanup.patch). This patch can be back-ported without
concern (although it doesn’t fix any bug per se).
The second patch does the vectorization of the shades. The original
patch that was submitted had some edge cases, but I have found that
just simply converting that for-loop that creates the shades into a
list comprehension (and then pass into np.array) yielded almost
identical speedups without changing the current code path. (Note: I
am a minimalist when it comes to patches). This is in
The third patch improves the calculation of the normals in
plot_surface by pre-allocating the arrays for calculating the vectors
and doing a final call to np.cross rather than appending on a list. I
deviated slightly from the original patch by calling “which” as
“which_pt”, adding a couple of comments, and also added an else
condition to create a “normal” with an empty list. This last part is
to keep the code path as similar as it was before. It was
theoretically possible to utilize a variable called normal elsewhere
without all the same conditions holding, so this guarantees that
normal exists, which was the case before. This patch is
Finally, the fourth patch utilizes numpy array functionality for
calculating the vertexes. This patch also forgoes the use of
transposed arrays. I took the original patch a step further and
eliminated the array transpose line earlier in the plot_surface
function. The array transpose was not being properly utilized here,
and I saw no speed penalty/speedup either way, so in favor of simpler
code, I eliminated its use. This patch is
Of the four patches, the speedups are in mostly found in the second
patch (100% speedup). The first patch does provide noticeable
improvements. There is also a slight improvement with the third
patch. I am up in the air regarding speed improvements with the
fourth patch, but I wonder if there might be some memory improvements
here, or if any speedup is being hidden by the for-loop that the
vectorization is done in.
I will let these patches be mulled over before applying them. Thanks
to JP for submitting the original patch.