> What I was alluding to was that if a backend primitive was
> added that allowed plotting a symbol (patch?) or point for
> an array of points. The base implementation would just do
> a python loop over the single point case so there is no
> requirement for a backend to overload this call. But it
> could do so if it wanted to loop over all points in C. How
> flexible to make this is open to discussion (e.g., allowing
> x, and y scaling factors, as arrays, for the symbol to be
> plotted, and other attributes that may vary with point such
> as color)
To make this work in the current design, you'll need more than a new
backend method.
Plot commands like scatter instantiate Artists (Circle) and add them
to the Axes as generic patch instances. On a call to draw, the Axes
instance iterates over all of its patch instances and forwards the
call on to the artists it contains. These, in turn, instantiate gc
instances which contain information like linewidth, facecolor,
edgecolor, alpha, etc. The patch instance also transforms its data
into display units and calls the relevant backend method. Eg, a
Circle instance would call
    renderer.draw_arc(gc, x, y, width, ...)
This makes it relatively easy to write a backend since you only have
to worry about 1 coordinate system (display) and don't need to know
anything about the Artist objects (Circle, Line, Rectangle, Text, ...).
The point is that no existing entity knows that a collection of
patches are all circles, and no one is keeping track of whether they
share a property or not. This buys you total flexibility to set
individual properties, but you pay for it in performance, since you
have to set every property for every object and call render methods
for each one, and so on.
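To make the per-object cost concrete, here is a rough sketch of the draw path described above. The classes are toy stand-ins I made up for illustration (GC, RecordingRenderer, and the scale argument standing in for the transform are all assumptions), not the real matplotlib code:

```python
# Hypothetical stand-ins for illustration only; not the real matplotlib classes.
class GC:
    def __init__(self, linewidth, facecolor):
        self.linewidth = linewidth
        self.facecolor = facecolor


class RecordingRenderer:
    """Toy backend that just records the calls it receives."""

    def __init__(self):
        self.calls = []

    def draw_arc(self, gc, x, y, width, height):
        self.calls.append(("draw_arc", gc.facecolor, x, y, width, height))


class Circle:
    def __init__(self, x, y, r, facecolor="b", linewidth=1.0):
        self.x, self.y, self.r = x, y, r
        self.facecolor, self.linewidth = facecolor, linewidth

    def draw(self, renderer, scale=1.0):
        # every artist builds its own gc ...
        gc = GC(self.linewidth, self.facecolor)
        # ... transforms its own data to display units (a plain scale here) ...
        x, y, d = self.x * scale, self.y * scale, 2 * self.r * scale
        # ... and makes its own backend call
        renderer.draw_arc(gc, x, y, d, d)


class Axes:
    def __init__(self):
        self.patches = []

    def draw(self, renderer, scale=1.0):
        # the Axes has no idea these are all circles with shared properties
        for patch in self.patches:
            patch.draw(renderer, scale)


axes = Axes()
axes.patches = [Circle(i, i, 0.5) for i in range(1000)]
renderer = RecordingRenderer()
axes.draw(renderer, scale=72.0)
print(len(renderer.calls))  # one gc and one backend call per circle
```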
My first response to this problem was to use a naive container class,
eg Circles, and an appropriate backend method, eg, draw_circles. In
this case, scatter would instantiate a Circles instance with a list of
circles. When Circles was called to render, it would need to create a
sequence of location data and a sequence of gcs
    locs = [(x0, y0, w0, h0), (x1, y1, w1, h1), ...]
    gcs = [circ0.get_gc(), circ1.get_gc(), ...]
and then call
    renderer.draw_ellipses(locs, gcs)
This would provide some savings, but probably not dramatic ones. The
backends would need to know how to read the GCs. In the backend_agg
extension code (in CVS), I've implemented functions that read the python
GraphicsContextBase information using the Python API:
    _gc_get_linecap
    _gc_get_joinstyle
    _gc_get_color # returns rgb
This is kind of backward, implementing an object in python and then
accessing it at the extension level code using the Python API, but it
does keep as much of the frontend in python as possible, which is
desirable. The point is that for your approach to work and to not
break encapsulation, the backends have to know about the GC.
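The naive container described above might look something like the following. This is only a sketch under my own assumptions: the GC and Circle classes are simplified placeholders, and RecordingRenderer is a made-up backend that counts what it has to do:

```python
# Hypothetical names throughout; a sketch of the idea, not existing matplotlib code.
class GC:
    def __init__(self, linewidth=1.0, rgb=(0.0, 0.0, 0.0)):
        self.linewidth, self.rgb = linewidth, rgb


class Circle:
    def __init__(self, x, y, r, rgb=(0.0, 0.0, 1.0)):
        self.x, self.y, self.r, self.rgb = x, y, r, rgb

    def get_gc(self):
        return GC(rgb=self.rgb)


class Circles:
    """Naive container: still one Circle and one gc per element."""

    def __init__(self, circles):
        self.circles = list(circles)

    def draw(self, renderer):
        locs = [(c.x, c.y, 2 * c.r, 2 * c.r) for c in self.circles]
        gcs = [c.get_gc() for c in self.circles]
        renderer.draw_ellipses(locs, gcs)


class RecordingRenderer:
    def __init__(self):
        self.calls = 0
        self.gcs_read = 0

    def draw_ellipses(self, locs, gcs):
        self.calls += 1
        for (x, y, w, h), gc in zip(locs, gcs):
            _ = (gc.linewidth, gc.rgb)  # the backend must still read every gc
            self.gcs_read += 1


renderer = RecordingRenderer()
Circles(Circle(i, i, 1.0) for i in range(500)).draw(renderer)
print(renderer.calls, renderer.gcs_read)
```

The backend is entered once, but it still has to unpack one gc per circle, which is why I don't expect dramatic savings from this variant.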
The discussion above was focused on preserving all the individual
properties of the artists (eg every circle can have its own linewidth,
color, alpha, dash style). But this is rare. Usually, we just want to
vary one or two properties across a large collection, eg, color in
pcolor and size and color in scatter.
Much better is to implement a GraphicsContextCollection, where the
relevant properties can be either individual elements or
len(collection) sequences. If a property is an element, it's
homogeneous across the collection. If it's len(collection), iterate
over it. The CircleCollection, instead of storing individual Circle
instances as I wrote about above, stores just the location and size
data in arrays and a single GraphicsContextCollection.
    def scatter(x, y, s, c):
        collection = CircleCollection(x, y, s)
        gc = GraphicsContextCollection()
        gc.set_linewidth(1.0)   # a single line width
        gc.set_foreground(c)    # a len(x) array of facecolors
        gc.set_edgecolor('k')   # a single edgecolor
        collection.set_gc(gc)
        axes.add_collection(collection)
        return collection
And this will be blazingly fast compared to the solution above, since,
for example, you transform the x, y, and s coordinates as numeric
arrays rather than individually. And there is almost no function call
overhead. And as you say, if the backend doesn't implement a
draw_circles method, the CircleCollection can just fall back on
calling the existing methods in a loop.
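Here is a minimal sketch of how the scalar-or-sequence properties and the fallback could fit together. Everything in it is an assumption for illustration: the generic set/at methods, the rule that only a list counts as a per-element property, the hasattr check, and the two toy renderers. Plain lists stand in for the numeric arrays.

```python
# Hypothetical sketch; the class names follow the message above, but none of
# this is existing matplotlib API.
class GraphicsContextCollection:
    def __init__(self, n):
        self.n = n
        self.props = {}

    def set(self, name, value):
        # Assumption: a list is a per-element property; anything else is shared.
        if isinstance(value, list) and len(value) != self.n:
            raise ValueError("%s needs %d entries, got %d"
                             % (name, self.n, len(value)))
        self.props[name] = value

    def at(self, i):
        # Resolve every property for element i.
        return {k: (v[i] if isinstance(v, list) else v)
                for k, v in self.props.items()}


class CircleCollection:
    def __init__(self, x, y, s):
        self.x, self.y, self.s = list(x), list(y), list(s)
        self.gc = None

    def set_gc(self, gc):
        self.gc = gc

    def draw(self, renderer):
        if hasattr(renderer, "draw_circles"):
            # fast path: one call, the backend loops over the points itself
            renderer.draw_circles(self.gc, self.x, self.y, self.s)
        else:
            # fallback: loop over the existing single-object method
            for i, (x, y, s) in enumerate(zip(self.x, self.y, self.s)):
                renderer.draw_arc(self.gc.at(i), x, y, s, s)


class LoopRenderer:
    """Toy backend with only the single-object method."""

    def __init__(self):
        self.calls = []

    def draw_arc(self, gc, x, y, width, height):
        self.calls.append((gc["facecolor"], gc["edgecolor"], x, y, width, height))


class FastRenderer(LoopRenderer):
    """Toy backend that also overrides the collection method."""

    def draw_circles(self, gc, x, y, s):
        self.calls.append(("draw_circles", len(x)))


x, y, s = [0.0, 1.0, 2.0], [0.0, 1.0, 4.0], [5.0, 5.0, 9.0]
gc = GraphicsContextCollection(len(x))
gc.set("linewidth", 1.0)              # shared
gc.set("facecolor", ["r", "g", "b"])  # one per circle
gc.set("edgecolor", "k")              # shared
collection = CircleCollection(x, y, s)
collection.set_gc(gc)

slow, fast = LoopRenderer(), FastRenderer()
collection.draw(slow)
collection.draw(fast)
print(len(slow.calls), len(fast.calls))
```

A backend that does nothing special keeps working through draw_arc, and one that implements draw_circles gets a single call with all the data.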
Thoughts?
JDH