refactoring the backend drawing methods

In dealing with the profiler output from some of Fernando's log plots,
I was reminded of the very inefficient way matplotlib handles marker
plots -- see my last post "log scaling fixes in backend_ps.py" for
details. Some of these problems were fixed for scatter plots using
collections, but line markers remained inefficient.

On top of this inefficiency, there have been three lingering problems
with backend design that have bothered me. 1) No path operations
(MOVETO, LINETO, etc), 2) transforms are being done in the front end
which is inefficient (some backends have transformations for free, eg
postscript) and can lead to plotting artifacts (eg Agg, which has a
concept of subpixel rendering), and 3) backends have no concept of
state or a gc stack, which can lead to lots of repetitive code and
needless function calls.

I've begun to address some of these concerns with a new backend method
"draw_markers". Currently the backend has too many drawing methods,
and this is yet another one. My goal is to define a core set, many
fewer than we have today, and do away with most of them. Eg
draw_pixel, draw_line, draw_point, draw_rectangle, draw_polygon, can
all be replaced by draw_path, with paths comprised solely of MOVETO,
LINETO and (optionally) ENDPOLY.
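
As a rough illustration of how the specialised methods could collapse
into draw_path, here is a sketch that expresses a rectangle and a single
line segment as path lists. The numeric values of the path codes and
the helper names rectangle_path and line_path are made up for this
example; only the element layout (code first, then x, y, or a fill flag
plus rgba for ENDPOLY) follows the template code later in this post.

```python
# Hypothetical numeric values; the real codes live in matplotlib.paths.
MOVETO, LINETO, ENDPOLY = 1, 2, 3


def rectangle_path(x, y, width, height, fill=False, rgba=(0.0, 0.0, 0.0, 1.0)):
    """Return a closed rectangle as a list of path elements."""
    return [
        (MOVETO, x, y),
        (LINETO, x + width, y),
        (LINETO, x + width, y + height),
        (LINETO, x, y + height),
        # ENDPOLY closes the polygon and carries the fill flag and colour
        (ENDPOLY, fill) + tuple(rgba),
    ]


def line_path(x1, y1, x2, y2):
    """Return an open two-point path; no ENDPOLY is needed."""
    return [(MOVETO, x1, y1), (LINETO, x2, y2)]
```

With helpers like these, a backend would only need to understand the
three path codes rather than one method per shape.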

Which leads me to question one for the backend maintainers: can you
support a draw_path method? I'm not sure that GTK and WX can. I have
no idea about FLTK, and QT, but both of these are Agg backends so it
doesn't matter. All the Agg backends automagically get these for
free. I personally would be willing to lose the native GTK and WX
backends.

I've implemented draw_markers for agg in CVS. lines.py tests for this
on the renderer so it doesn't break any backend code right now.
Basically if you implement draw_markers, it calls it, otherwise it
does what it always did. This leads to complicated code in lines.py
that I hope to flush when the port to the other backends is complete.
draw_markers is the start of fixing backend problems 1 and 2 above.
Also, I will extend the basic path operations to support splines,
which will allow for more sophisticated drawing, and better
representations of circles (eg Knuth uses splines to draw circles in
TeX), which are a very close approximation to real circles.
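
The "call draw_markers if the renderer has it, otherwise do what it
always did" test can be sketched roughly as follows. This is not the
actual lines.py code: the function name draw_line_markers, the two toy
renderer classes, and the use of draw_point as the legacy route are all
stand-ins chosen for the example.

```python
def draw_line_markers(renderer, gc, path, xs, ys, transform):
    """Use draw_markers when the renderer provides it, else fall back."""
    if hasattr(renderer, 'draw_markers'):
        # Ported backend: one call handles every marker.
        renderer.draw_markers(gc, path, xs, ys, transform)
        return 'draw_markers'
    # Legacy route: one call per marker. draw_point is only a stand-in
    # here for whatever per-marker drawing the old code did.
    for x, y in zip(xs, ys):
        renderer.draw_point(gc, x, y)
    return 'legacy'


class PortedRenderer(object):
    """Toy renderer that implements the new method."""
    def __init__(self):
        self.calls = 0

    def draw_markers(self, gc, path, xs, ys, transform):
        self.calls += 1


class LegacyRenderer(object):
    """Toy renderer that only has the old per-marker method."""
    def __init__(self):
        self.calls = 0

    def draw_point(self, gc, x, y):
        self.calls += 1
```

The ported renderer receives a single call regardless of the number of
markers, while the legacy renderer is called once per data point.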

I'm not putting this in backend_bases yet since I'm using the presence
of the method as the test for whether a backend is ported yet in
lines.py.

    def draw_markers(self, gc, path, x, y, transform):

path is a list of path elements -- see matplotlib.paths. Right now
path is only a data structure, which suffices for my simple needs
right now, but we can provide user friendly methods to facilitate the
generation of these lists down the road.

The coordinates of the "path" argument in draw_markers are display (eg
points for PS) and are simply points*dpi (this could be generalized if
need be with its own transform, but I don't see a need right now --
markers in matplotlib are by definition in points). x and y are in
data coordinates, and transform is a matplotlib.transform
Transformation instance. There are a few types of transformations
(separable, nonseparable and affine) but all three have a consistent
interface -- there is an (optional) nonlinear component, eg log or
polar -- and all provide an affine vec6. Thus the transformation can
be done like

        if transform.need_nonlinear():
            x,y = transform.nonlinear_only_numerix(x, y)

        # the a,b,c,d,tx,ty affine which transforms x and y
        vec6 = transform.as_vec6_val()
        # apply an affine transformation of x and y
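
The last step, applying the affine part, might look like the sketch
below. It assumes the six values are ordered a, b, c, d, tx, ty in the
PostScript matrix convention (x' = a*x + c*y + tx, y' = b*x + d*y + ty);
that ordering is an assumption, so check it against the transform module
before relying on it. apply_affine is a hypothetical helper name.

```python
def apply_affine(vec6, xs, ys):
    """Apply an affine transform, given as (a, b, c, d, tx, ty), to points.

    Assumed convention (PostScript-style matrix):
        x' = a*x + c*y + tx
        y' = b*x + d*y + ty
    """
    a, b, c, d, tx, ty = vec6
    xt = [a * x + c * y + tx for x, y in zip(xs, ys)]
    yt = [b * x + d * y + ty for x, y in zip(xs, ys)]
    return xt, yt
```

For example, a pure scale-and-shift such as (2, 0, 0, 3, 10, 20) doubles
x and adds 10, and triples y and adds 20.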

This setup buys us a few things -- for large data sets, it can save
the cost of doing the transformation for backends that have
transformations built in (eg ps, when the transformation happens at
rendering). For agg, it reduces the number of passes through the data
since the transformation happens in the rendering loop, which it has
to make anyway. It also allows agg to try/except the nonlinear
transformation part, and drop data points which throw a domain_error
(nonpositive log). This means you can toggle log/linear axes with the
'l' command and won't raise even if you have nonpositive data on the
log axes.
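
Agg does this in extension code, but the idea of dropping points that
fail the nonlinear step can be illustrated in plain Python. The sketch
below assumes a log10 transform on both axes and uses a hypothetical
helper name; it is not the Agg implementation.

```python
import math


def log10_drop_invalid(xs, ys):
    """Log-transform both coordinates, skipping points that are not positive."""
    xout, yout = [], []
    for x, y in zip(xs, ys):
        try:
            lx = math.log10(x)
            ly = math.log10(y)
        except ValueError:
            # math.log10 raises ValueError (math domain error) for values <= 0
            continue
        xout.append(lx)
        yout.append(ly)
    return xout, yout
```

Points with a zero or negative coordinate simply disappear from the
output rather than aborting the whole draw.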

Most importantly it buys you speed, since the graphics context and
marker path only need to be set once, outside the loop, and then you
can iterate over the x,y position vertices and draw that marker at
each position. This results in a roughly 10x performance boost for
large numbers of markers in agg:

Old:
    N=001000: 0.24 s
    N=005000: 0.81 s
    N=010000: 1.30 s
    N=050000: 5.97 s
    N=100000: 11.46 s
    N=500000: 56.87 s

New:
    N=001000: 0.13 s
    N=005000: 0.19 s
    N=010000: 0.28 s
    N=050000: 0.66 s
    N=100000: 1.04 s
    N=500000: 4.51 s

agg implements this in extension code, which might be harder for
backend writers to follow as an example. So I wrote a template in
backend_ps, which I named _draw_markers -- the underscore prevents it
from actually being called by lines.py. It is basically there to show
other backend writers how to iterate over the data structures and use
the transform.

    def _draw_markers(self, gc, path, x, y, transform):
        """
        I'm underscore hiding this method from lines.py right now
        since it is incomplete
        
        Draw the markers defined by path at each of the positions in x
        and y. path coordinates are points, x and y coords will be
        transformed by the transform
        """
        if debugPS:
            self._pswriter.write("% markers\n")

        if transform.need_nonlinear():
            x,y = transform.nonlinear_only_numerix(x, y)

        # the a,b,c,d,tx,ty affine which transforms x and y
        vec6 = transform.as_vec6_val()
        # this defines a single vertex. We need to define this as ps
        # function, properly stroked and filled with linewidth etc,
        # and then simply iterate over the x and y and call this
        # function at each position. Eg, this is the path that is
        # relative to each x and y offset.
        ps = []
        for p in path:
            code = p[0]
            if code==MOVETO:
                mx, my = p[1:]
                ps.append('%1.3f %1.3f m' % (mx, my))
            elif code==LINETO:
                mx, my = p[1:]
                ps.append('%1.3f %1.3f l' % (mx, my))
            elif code==ENDPOLY:
                fill = p[1]
                if fill: # we can get the fill color here
                    rgba = p[2:]
                    
        vertfunc = 'some magic ps function that draws the marker relative to an x,y point'
        # the gc contains the stroke width and color as always
        for i in xrange(len(x)):
            # for large numbers of markers you may need to chunk the
            # output, eg dump the ps in 1000 marker batches
            thisx = x[i]
            thisy = y[i]
            # apply affine transform x and y to define marker center
            #draw_marker_here

        print 'I did nothing!'

For PS specifically, ideally we would define a native postscript
function for the path, and call this function for each vertex. Can
you insert PS functions at arbitrary points in PS code, or do they
have to reside in the header? If the former, we may want to buffer
the input with stringio to save the functions we need, since we don't
know until runtime which functions we'll be defining.
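
One way the "define the marker once, call it per vertex" idea could look
is sketched below. It generates PostScript text in which the marker is a
procedure that translates to the x,y left on the stack, strokes the
path, and restores the graphics state. The path code values, the
function names and the procedure name m0 are all made up for the
example, and fill handling is omitted. As far as the language itself is
concerned, a procedure only has to be defined before its first use, but
document structuring conventions may prefer the prolog, so treat the
placement as something to verify.

```python
# Hypothetical numeric values; the real codes live in matplotlib.paths.
MOVETO, LINETO, ENDPOLY = 1, 2, 3


def marker_procedure(name, path):
    """Return a PostScript definition that draws path at the x y on the stack."""
    ops = []
    for p in path:
        code = p[0]
        if code == MOVETO:
            ops.append('%1.3f %1.3f moveto' % tuple(p[1:3]))
        elif code == LINETO:
            ops.append('%1.3f %1.3f lineto' % tuple(p[1:3]))
        elif code == ENDPOLY:
            ops.append('closepath')
    body = ' '.join(ops)
    # translate consumes the x and y that the caller pushed
    return '/%s { gsave translate newpath %s stroke grestore } bind def' % (name, body)


def ps_markers(path, xs, ys, name='m0'):
    """Return PostScript lines: one definition, then one call per vertex."""
    lines = [marker_procedure(name, path)]
    for x, y in zip(xs, ys):
        lines.append('%1.3f %1.3f %s' % (x, y, name))
    return lines
```

The per-marker output is then a single short line, which also makes it
easy to write the calls in batches for large data sets.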

OK, give it a whirl. Nothing is set in stone so feel free to comment
on the design. I think we could get away with just a few backend
methods:

  # draw_lines could be done with a path but we may want to special
  # case this for maximal performance
  draw_lines

  draw_markers

  draw_path

  ... and I'll probably leave the collection methods...

Ted Drain mentioned wanting draw_ellipse for high resolution ellipse
drawing (eg rather than using discrete vertices). I'm not opposed to it, but I
wonder if the spline method of drawing ellipses referred to above
might not suffice here. In which case draw_ellipse would be subsumed
under draw_path.
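
To give a feel for how an ellipse could be expressed as a spline path,
here is a sketch using four cubic Bezier segments with the standard
control-point constant 4/3*(sqrt(2)-1), about 0.5523. The CURVE4 code,
its element layout (code, x1, y1, x2, y2, x3, y3) and the function names
are assumptions for this example; no such path code exists in the
current design.

```python
import math

# Hypothetical codes; a cubic-curve code is not part of the current paths.
MOVETO, CURVE4 = 1, 4

# Control-point offset that makes each quarter-arc midpoint lie on the circle
KAPPA = 4.0 * (math.sqrt(2.0) - 1.0) / 3.0


def ellipse_path(cx, cy, rx, ry):
    """Return an ellipse as a MOVETO followed by four cubic Bezier segments."""
    k = KAPPA
    # (control 1, control 2, end point) for each quadrant of the unit circle
    unit = [
        ((1, k), (k, 1), (0, 1)),
        ((-k, 1), (-1, k), (-1, 0)),
        ((-1, -k), (-k, -1), (0, -1)),
        ((k, -1), (1, -k), (1, 0)),
    ]

    def pt(p):
        # scale and shift a unit-circle point onto the ellipse
        return (cx + rx * p[0], cy + ry * p[1])

    path = [(MOVETO,) + pt((1, 0))]
    for c1, c2, end in unit:
        path.append((CURVE4,) + pt(c1) + pt(c2) + pt(end))
    return path


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    s = 1.0 - t
    return tuple(s**3 * a + 3 * s * s * t * b + 3 * s * t * t * c + t**3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))
```

Sampling the segments for a unit circle shows the radius staying within
a small fraction of a percent of 1, which is the sense in which the
spline is a very close approximation.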

Although what I've done is incomplete, I thought it might be better to
get something in CVS to give other backend writers time to implement
it, and to get some feedback before finishing the refactor.

Also, any feedback on the idea of removing GD, native GTK and native
WX is welcome. I'll bounce this off the user list in any case.

JDH

John Hunter wrote:

I've begun to address some of these concerns with a new backend method
"draw_markers". Currently the backend has too many drawing methods,
and this is yet another one. My goal is to define a core set, many
fewer than we have today, and do away with most of them. Eg
draw_pixel, draw_line, draw_point, draw_rectangle, draw_polygon, can
all be replaced by draw_path, with paths comprised solely of MOVETO,
LINETO and (optionally) ENDPOLY.

Though the idea of having a minimal set of draw methods is esthetically appealing, in practice having to reuse these few commonly drawn methods to construct simple compound objects can become cumbersome and annoying. I would suggest taking another look at Adobe's PDF language and Apple's Quartz (or display PDF). It is interesting to see that each new version adds one or two new draw methods, e.g. rectangle and rectangles, to make it easier and faster to draw these common paths. I'm guessing these new methods are based on copious experience.

While you are at it, I would suggest looking at the clipping issue. As in PS and PDF, a path can be used as a clipping region. This allows the backend, in PS and PDF, to do the clipping for you, which can make the code simpler and also faster. Clipping using arbitrary paths may currently be an issue for AGG, but in my opinion, it is something that AGG will eventually have to come to grips with.

Just my $0.01.

  -- Paul

--
Paul Barrett, PhD Space Telescope Science Institute
Phone: 410-338-4475 ESS/Science Software Branch
FAX: 410-338-4767 Baltimore, MD 21218

Which leads me to question one for the backend maintainers: can you
support a draw_path method? I'm not sure that GTK and WX can. I have
no idea about FLTK, and QT, but both of these are Agg backends so it
doesn't matter. All the Agg backends automagically get these for
free. I personally would be willing to lose the native GTK and WX
backends.

I don't think that GTK can support a draw_path method at the moment, but
when GTK starts to use Cairo it should.

Also, any feedback on the idea of removing GD, native GTK and native
WX are welcome. I'll bounce this off the user list in any case.

The GTK+ developers have recently announced that Cairo is now a GTK+
dependency
http://www.osnews.com/story.php?news_id=9609

Pango is currently being updated to use Cairo, and later GDK and GTK+
are expected to support/use Cairo. It looks like GTK rendering is in the
process of being improved dramatically, so I'd recommend keeping the
native GTK backend a bit longer to see what happens.

Also it means we can expect all future Linux distributions (that include
GTK+ 2.8 or later) to include Cairo.

Regards
Steve

On Tue, 2005-02-08 at 08:51 -0600, John Hunter wrote:

John Hunter wrote:

    def draw_markers(self, gc, path, x, y, transform):

It might be worth generalizing this to take some optional array/scalar
arguments to apply marker by marker scaling (or even independent x/y
scaling) as well as overrides to some gc items (e.g. color). This doesn't
need to be done right away, but may provide some very useful capability
(e.g., error bars could be done as markers if y or x size could be scaled
for every x,y location).
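
A minimal sketch of what marker-by-marker scaling might involve is shown
below. It accepts either a scalar or a per-marker sequence for each
axis and returns one scaled copy of the marker path per marker. The
names per_marker and scaled_paths, the path code values, and the
assumption that every path element is (code, x, y) are all hypothetical.

```python
# Hypothetical numeric values; the real codes live in matplotlib.paths.
MOVETO, LINETO = 1, 2


def per_marker(value, n):
    """Expand a scalar to n copies, or check that a sequence has length n."""
    try:
        values = list(value)
    except TypeError:
        # scalars are not iterable, so repeat the single value
        return [value] * n
    if len(values) != n:
        raise ValueError('expected %d values, got %d' % (n, len(values)))
    return values


def scaled_paths(path, n, xscale=1.0, yscale=1.0):
    """Return one scaled copy of the marker path for each of n markers."""
    out = []
    for sx, sy in zip(per_marker(xscale, n), per_marker(yscale, n)):
        out.append([(code, px * sx, py * sy) for code, px, py in path])
    return out
```

An error bar then becomes a unit vertical line marker whose yscale is
the array of error sizes, one per data point.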

Perry

I've implemented draw_markers() for Cairo, and tested it using
line-styles.py - it seems to be working OK. I did find that it caused
draw_lines() to stop working and had to modify it to get it working
again.

I don't think 'fill' and 'fill_rgb' information should be encoded into
the 'path', and would prefer to have rendering separated into two
independent steps:
1) call a function to parse 'path' and generate a path - the path is a
general path (with no fill or colour info) that you can later use any
way you wish.
2) set colour etc as desired and fill/stroke the path.

The draw_markers() code I've written calls generate_path() before
drawing each marker and it reads the fill value and the fill_rgb each
time which is unnecessary since the values are the same for all the
markers. Passing the fill_rgb as an extra argument to draw_markers()
would be one way to 'fix' this.
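
The two-step separation could be sketched like this. The context here
is a stand-in that only records the calls made on it, with method names
modelled on a cairo-style drawing context; the path codes (including a
colour-free CLOSE code) and the exact draw_markers signature are
assumptions for illustration, not the code in CVS.

```python
# Hypothetical numeric values; CLOSE carries no fill or colour information.
MOVETO, LINETO, CLOSE = 1, 2, 3


class RecordingContext(object):
    """Stand-in drawing context that just records the calls made on it."""
    def __init__(self):
        self.ops = []

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the drawing calls.
        def record(*args):
            self.ops.append((name,) + args)
        return record


def generate_path(ctx, path):
    """Step 1: emit geometry only, with no fill or colour information."""
    for element in path:
        code = element[0]
        if code == MOVETO:
            ctx.move_to(*element[1:3])
        elif code == LINETO:
            ctx.line_to(*element[1:3])
        elif code == CLOSE:
            ctx.close_path()


def draw_markers(ctx, path, xs, ys, stroke_rgb=(0, 0, 0), fill_rgb=None):
    """Step 2: colour is passed in once by the caller, not read from the path."""
    for x, y in zip(xs, ys):
        ctx.save()
        ctx.translate(x, y)
        generate_path(ctx, path)
        if fill_rgb is not None:
            ctx.set_source_rgb(*fill_rgb)
            ctx.fill_preserve()
        ctx.set_source_rgb(*stroke_rgb)
        ctx.stroke()
        ctx.restore()
```

Because the path no longer carries colour, the same generate_path
result could be stroked, filled or used some other way.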

Cairo (and probably Agg, PS, SVG) supports rel_move_to() and
rel_line_to() - so you can define a path using relative rather than
absolute coords, which can sometimes be useful.
For example, instead of
    translate(x,y)
    generate_absolute_path(path)
    stroke()
you can use
    move_to(x,y)
    generate_relative_path(path)
    stroke()
and the path is stroked relative to x,y with no need to translate the
coordinates.
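
The equivalence of the two approaches can be checked with a small sketch
in plain Python. It converts absolute marker vertices into successive
relative offsets and confirms that replaying those offsets from x,y
visits the same points as translating the absolute path. The function
names are hypothetical and no drawing library is involved.

```python
def to_relative(vertices):
    """Convert absolute marker vertices to successive relative offsets."""
    rel = []
    px, py = 0, 0   # the marker's own origin
    for x, y in vertices:
        rel.append((x - px, y - py))
        px, py = x, y
    return rel


def replay_translated(vertices, x, y):
    """translate(x, y) followed by the absolute path."""
    return [(x + vx, y + vy) for vx, vy in vertices]


def replay_relative(rel, x, y):
    """move_to(x, y) followed by relative moves from the current point."""
    out = []
    cx, cy = x, y
    for dx, dy in rel:
        cx += dx
        cy += dy
        out.append((cx, cy))
    return out
```

Note that the relative form needs each offset to be measured from the
previous vertex, not from the marker origin, so the conversion has to be
done once when the path is generated.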

Steve