Hi,
While improving the performance of faceted scatter plots of high-dimensional data, I noticed that much of the time (as much as 50%) was spent on axis creation.
On my machine, creating a 20x20 array of subplots without plotting anything takes about 11 seconds (for comparison, plotting 5000 points on all of them takes only 0.6 s):
import matplotlib
matplotlib.interactive(True)
import matplotlib.pyplot as plt
fig, axes = plt.subplots(20,20)
plt.show()
Profiling shows that 50% of the computation time is spent on axis/tick creation [1], which I have to remove anyway. Is there an easy way of creating thinned axes without ticks and spines?
So far I have solved the problem by subclassing the Axes class (see this gist [2]) and removing all spines and ticks. Running the above example with it gives about a 10x boost in performance (from 11 s to 0.9 s):
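For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The helper names `profile_call` and `make_grid` are hypothetical (they are not from the gist), and the `Agg` backend is an assumption made so that no window is required; the numbers will differ between machines and matplotlib versions.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns (result, report), where report is the text of the `top`
    entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def make_grid(n=20):
    """Create an n x n grid of empty subplots, as in the example above."""
    # matplotlib is imported lazily so the profiling helper itself
    # depends only on the standard library.
    import matplotlib
    matplotlib.use("Agg")  # assumption: non-interactive backend
    import matplotlib.pyplot as plt
    return plt.subplots(n, n)
```

Calling `(fig, axes), report = profile_call(make_grid, 20)` and then `print(report)` lists the functions with the largest cumulative time, which is where the axis/tick creation cost should show up.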
import thin_axes
fig, axes = plt.subplots(20,20, subplot_kw=dict(projection='thin'))
plt.show()
Profiling results show a more uniform distribution of computing time across functions (most of the time is spent on creating and applying transforms [3]).
The thinned class seems a bit hacky. Is there any other way to create a raw Axes object without spines, ticks, labels, etc., just a pure canvas with the appropriate transforms?
Yours,
Bartosz
[1] profiling results of vanilla Axes: http://pbrd.co/1jlovoo
[2] https://gist.github.com/btel/a6b97e50e0f26a1a5eaa
[3] profiling results of thinned Axes:
Hi, I have also found tick marks to be a real performance drain and am trying to fix this. I have yet to get my ideas into a shape worthy of a pull request. It's a rather large change under the hood, so there are probably quite a few edge cases that I'm not aware of, since I'm sure I only care about 50% (or less) of the full range of flexibility. That said, simple graphs with basic tick marks are much slower than they need to be.
My work is at https://github.com/jbmohler/mplfastaxes (a prototype for optimized MPL tick marks), and I also used the custom projection method to replace the Axes/Axis classes. I have incorporated your example because I think it is interesting (even though a 20x20 grid of axes seems crazy to me ... it may make sense, though).
You have addressed a somewhat different case than I have: I've focused on the speed of drawing the graphics, whereas your gist illustrates that making a new figure with many axes is very slow. I believe the same ideas apply, and I'm going to spend some time right now improving my code's initialization, which is basically unchanged from MPL at this point.
Joel
On 07/08/2014 11:33 AM, Bartosz wrote:
[...]
I wonder if the tick marks could take advantage of LineCollection (if they don't already), or maybe go so far as to be PatchCollections? We could maintain the full feature set, but take advantage of some of the optimized rendering pathways in the backends that were originally made for plot().
Just thinking off the top of my head at the moment.
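To make the idea concrete, here is a rough sketch of drawing all x tick marks of an Axes as a single `matplotlib.collections.LineCollection` instead of one artist per tick. This is only an illustration, not how matplotlib implements ticks; the helper names `tick_segments` and `add_ticks_as_collection`, the default tick length, and the styling values are all assumptions.

```python
def tick_segments(positions, length=0.02):
    """Return one ((x, 0), (x, -length)) segment per tick position.

    x is meant to be in data coordinates and y in axes coordinates,
    so each mark hangs just below the bottom edge of the axes.
    """
    return [((x, 0.0), (x, -length)) for x in positions]


def add_ticks_as_collection(ax, positions, length=0.02):
    """Draw all x tick marks of `ax` as a single LineCollection artist."""
    # Imported lazily so tick_segments() works without matplotlib.
    from matplotlib.collections import LineCollection

    ticks = LineCollection(
        tick_segments(positions, length),
        colors="black",
        linewidths=0.8,
        transform=ax.get_xaxis_transform(),  # x: data coords, y: axes coords
        clip_on=False,  # the marks lie outside the axes box
    )
    # autolim=False: the marks should not influence the data limits.
    ax.add_collection(ticks, autolim=False)
    return ticks
```

With an existing Axes `ax`, something like `add_ticks_as_collection(ax, [0, 1, 2, 3])` adds one artist for all four marks. Tick labels, minor ticks, and per-tick styling are left out here, and those are probably where the edge cases are.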
Cheers!
Ben Root
On Wed, Jul 9, 2014 at 12:54 PM, Joel B. Mohler <joel@…272…1193…> wrote:
[...]
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel