Mixed-mode rendering

Hi,

On trunk, there is mixed-mode rendering support built in to (at least) the SVG and PDF backends, though there are no calls to start/stop_rasterizing() that utilize the raster mode. I’ve implemented mode switching for those backends, and would appreciate feedback on what I’ve done.

There are two modes that might drive a switch to raster for some artists:

  1. User-driven specification of known complex artists.

  2. Automatic detection of artist complexity (by type or vertex count).

The first mode is what I coded up, so I’ll discuss it below.

A list of artists to rasterize is passed as a draw_raster kwarg to savefig, which percolates down into print_* in the backend-specific figure canvas. When the backend’s canvas creates a renderer, the draw_raster list is placed as an attribute on the renderer instance. I figured that the renderer should be responsible for transporting the list of artists needing rasterization, since it’s at the renderer level that pixel vs. vector matters.

The switch to/from raster mode was made in Axes.draw, where the artists for each axes are looped over. In the artist loop, I check if the artist to be rendered is listed in the draw_raster attribute on the renderer instance. If so, the appropriate calls are made to start and stop rasterizing.
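
To make the control flow concrete, here is a minimal, self-contained sketch of the loop described above. It is not the patch itself: RecordingRenderer, Artist, and draw_artists are hypothetical stand-ins for the real renderer, artist, and Axes.draw classes, and the only names taken from the description are draw_raster, start_rasterizing, and stop_rasterizing.

```python
class RecordingRenderer:
    """Stand-in for a mixed-mode renderer; it only records the calls it gets."""

    def __init__(self, draw_raster=()):
        # As in the description: the list of artists to rasterize
        # travels as an attribute on the renderer instance.
        self.draw_raster = list(draw_raster)
        self.calls = []

    def start_rasterizing(self):
        self.calls.append("start_rasterizing")

    def stop_rasterizing(self):
        self.calls.append("stop_rasterizing")


class Artist:
    """Hypothetical minimal artist with a name and a zorder."""

    def __init__(self, name, zorder=0):
        self.name = name
        self.zorder = zorder

    def draw(self, renderer):
        renderer.calls.append("draw " + self.name)


def draw_artists(artists, renderer):
    """Mimic the artist loop in Axes.draw with the raster switch added."""
    to_rasterize = getattr(renderer, "draw_raster", ())
    for artist in sorted(artists, key=lambda a: a.zorder):
        # Compare by identity so that artists need no special __eq__.
        rasterize = any(artist is listed for listed in to_rasterize)
        if rasterize:
            renderer.start_rasterizing()
        artist.draw(renderer)
        if rasterize:
            renderer.stop_rasterizing()


line = Artist("line", zorder=2)
mesh = Artist("mesh", zorder=1)
renderer = RecordingRenderer(draw_raster=[mesh])
draw_artists([line, mesh], renderer)
print(renderer.calls)
```

Because the artists are still visited in zorder, rasterized and vector drawing stay interleaved in the right order.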

Sample usage:

   from matplotlib import pyplot

   f = pyplot.figure()
   ax = f.add_subplot(111)
   p, = ax.plot(range(10))
   # draw_raster is the kwarg added by this patch
   f.savefig('test.pdf', draw_raster=(p,))

svn diff at http://www.deeplycloudy.com/20080503-matplotlib-mixed-mode-r5110.diff

Thanks,

Eric Bruning

Graduate Research Assistant, Meteorology, Univ. Oklahoma

As of 6/1/2008, Research Assoc., Univ. Maryland/CICS and NOAA/NESDIS/STAR

Hi Eric, thanks for the patch. There are a couple of aspects of the
design here that I am not comfortable with, but I think with a few
changes this will be useful (though Michael, who implemented the mixed
mode renderer, will surely have important comments). The primary
thing that bothers me is that one of the core aspects of the
matplotlib backend design is that the renderers know nothing about
artists -- artists know about renderers, but not the other way around.
So I don't like using the renderer to store the rasterized artists.
It makes more sense to me for the artist to have a property set
("set_rasterized" which could be True|False|None where None means "do
the default thing for the renderer"). Then you could do:

   if a.get_rasterized():
       renderer.start_rasterizing()
       a.draw(renderer)
       renderer.stop_rasterizing()
   else:
       a.draw(renderer)

Doing this in the axes.draw method may not be the most natural place
to do this since it could be done in the artist.draw method, but it
may be the most expedient. This is an area where having support for
before_draw and after_draw hooks might be useful. One potential
problem with either of these approaches is that the mixed mode
renderer looks to be set up to handle multiple rasterized draws before
dumping the aggregate image into the backend on a stop_rasterizing, so
doing the start/stop in any of the approaches above would defeat this
efficiency. The axes could aggregate the rasterized artists before
rendering and then do them all together, but making this play properly
with zorder will be tricky. It does something like this already with
the "animated" artists so you may want to look at that. For animated
artists in the current implementation, zorder is ignored (the animated
artists are drawn on top). Chaco does something a bit more
sophisticated than this, since they have separate rendering levels and
buffers.
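
The per-artist flag suggested above could look roughly like the following sketch. Everything here is an assumption made for illustration: the Artist and Renderer classes are toy stand-ins, and rasterize_by_default is a hypothetical attribute used to give None ("do the default thing for the renderer") a concrete meaning.

```python
class Renderer:
    """Toy renderer that records calls; rasterize_by_default is hypothetical."""

    def __init__(self, rasterize_by_default=False):
        self.rasterize_by_default = rasterize_by_default
        self.calls = []

    def start_rasterizing(self):
        self.calls.append("start_rasterizing")

    def stop_rasterizing(self):
        self.calls.append("stop_rasterizing")


class Artist:
    """Toy artist carrying a tri-state rasterized property."""

    def __init__(self, name, rasterized=None):
        self.name = name
        self._rasterized = None
        self.set_rasterized(rasterized)

    def set_rasterized(self, rasterized):
        """Accept True, False, or None (None defers to the renderer)."""
        if rasterized is not None and not isinstance(rasterized, bool):
            raise ValueError("rasterized must be True, False, or None")
        self._rasterized = rasterized

    def get_rasterized(self):
        return self._rasterized

    def draw(self, renderer):
        renderer.calls.append("draw " + self.name)


def draw_artist(artist, renderer):
    """Draw one artist, switching to raster mode if its flag asks for it."""
    rasterized = artist.get_rasterized()
    if rasterized is None:
        rasterized = renderer.rasterize_by_default
    if rasterized:
        renderer.start_rasterizing()
        artist.draw(renderer)
        renderer.stop_rasterizing()
    else:
        artist.draw(renderer)


renderer = Renderer()
for artist in (Artist("mesh", rasterized=True), Artist("line")):
    draw_artist(artist, renderer)
print(renderer.calls)
```

The renderer never learns which artists exist; it only receives start/stop calls, which keeps the "artists know about renderers, but not the other way around" rule intact.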

Another, less critical, aspect of the patch that bothers me is
tagging the renderer with the undocumented attribute "draw_raster"
and then checking this with a hasattr in axes.draw. Python lets you
do this kind of stuff, and I've done plenty of it myself in
application building, but in my experience it makes for code that is
hard to maintain once the code base grows sufficiently large.
Although the mpl setters and getters are not the most pythonic
approach, they do make for code that is fairly readable and,
importantly, easily documented.

JDH

On Sat, May 3, 2008 at 11:44 PM, Eric Bruning <eric.bruning@...149...> wrote:

The switch to/from raster mode was made in Axes.draw, where the artists for
each axes are looped over. In the artist loop, I check if the artist to be
rendered is listed in the draw_raster attribute on the renderer instance. If
so, the appropriate calls are made to start and stop rasterizing.

Thanks for having a second look at this, Eric. I consider mixed-mode drawing somewhat of an experiment at this point -- it's still an open question whether it should be included in the next release, and definitely needs more use cases. It is currently only used in the trunk by quad meshes. In the common case where the mesh is of sufficient resolution, the rasterized version results in a smaller, faster file. However, it probably needs to be exposed as an option to the user, in the case where the quads are large (or to do it adaptively based on dpi, but that may be too smart).

John Hunter wrote:

Hi Eric, thanks for the patch. There are a couple of aspects of the
design here that I am not comfortable with, but I think with a few
changes this will be useful (though Michael, who implemented the mixed
mode renderer, will surely have important comments). The primary
thing that bothers me is that one of the core aspects of the
matplotlib backend design is that the renderers know nothing about
artists -- artists know about renderers, but not the other way around.
So I don't like using the renderer to store the rasterized artists.
It makes more sense to me for the artist to have a property set
("set_rasterized" which could be True|False|None where None means "do
the default thing for the renderer"). Then you could do:

   if a.get_rasterized():
       renderer.start_rasterizing()
       a.draw(renderer)
       renderer.stop_rasterizing()
   else:
       a.draw(renderer)
  

This is where I was (implicitly) going with all this.

Doing this in the axes.draw method may not be the most natural place
to do this since it could be done in the artist.draw method, but it
may be the most expedient. This is an area where having support for
before_draw and after_draw hooks might be useful. One potential
problem with either of these approaches is that the mixed mode
renderer looks to be set up to handle multiple rasterized draws before
dumping the aggregate image into the backend on a stop_rasterizing, so
doing the start/stop in any of the approaches above would defeat this
efficiency. The axes could aggregate the rasterized artists before
rendering and then do them all together, but making this play properly
with zorder will be tricky.

That's right. To receive significant gains, you would generally want to perform a number of drawing operations in a single raster buffer. Given how mpl is currently designed, that generally implies a Collection (which is fairly easy to wrap start/stop_rasterizing calls around) -- it doesn't necessarily have to mean a whole bunch of separate Artist objects (which would be much harder to do given the current way things are drawn). The latter may be ultimately more optimal, but the former is an easy win.

My concern with this patch is that it expects the user to know about artists at all. Sure, there are many advanced techniques where the user has to know what artists are etc., but for a lot of the pylab-level usage, that's not a concept many users must be familiar with. I think it makes more sense to just add a "rasterized" kwarg (and get/set_rasterized) to calls such as "pcolormesh" so the user can choose how it is drawn. The "draw_raster" list concept requires users to create a "set" from something that IMHO is more like a flag on individual objects. As an obtuse example, it would be like creating a list of all blue objects, and a list of all red ones, rather than just setting their colors appropriately.

As for the implementation, Eric's patch does appear to deal with the z-order problem (by interleaving rasterized and non-rasterized drawing correctly based on zorder), but it doesn't combine adjacent rasterized artists into a single buffer. The drawing loop in Axes.draw could fairly easily be modified to do that. However, I think a better solution, one that doesn't require an explicit "draw_raster" list, is to make "stop_rasterizing" lazy. At the moment, when "stop_rasterizing" is called, the buffer is immediately flushed and written out. If instead it just set a flag, causing the buffer to be flushed when the next vector object is written, then something like

  start_rasterizing()
  draw_line()
  stop_rasterizing()
  start_rasterizing()
  draw_line()
  stop_rasterizing()
  draw_line()

would write out a single raster buffer with two lines, followed by a vector line. Of course, and here is the tricky bit, if the two rasterized objects are really far apart spatially, you waste a lot of space on transparent pixels between them. We can let the user decide whether to rasterize each individually with a nice clean API, but letting the user decide whether to combine adjacent rasterizations gets tricky and I think is asking too much of someone who doesn't know the mpl internals. Perhaps that is an argument against trying to implicitly combine adjacent rasterized draws -- it's trying to be too smart?
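
A rough sketch of the lazy stop_rasterizing idea, to show the intended behavior only. LazyMixedRenderer, its output list, and finalize are hypothetical names invented for this example and do not describe the actual mixed-mode renderer; the raster buffer is modeled as a plain list of the primitives drawn into it.

```python
class LazyMixedRenderer:
    """Toy renderer whose stop_rasterizing defers flushing the raster buffer."""

    def __init__(self):
        self.output = []          # what reaches the vector backend, in order
        self._buffer = []         # primitives drawn into the pending raster buffer
        self._rasterizing = False

    def start_rasterizing(self):
        self._rasterizing = True

    def stop_rasterizing(self):
        # Lazy: leave raster mode but keep the buffer pending, so that an
        # immediately following start_rasterizing() reuses the same buffer.
        self._rasterizing = False

    def _flush(self):
        """Write the pending buffer out as one image, if there is one."""
        if self._buffer:
            self.output.append(("image", tuple(self._buffer)))
            self._buffer = []

    def draw_line(self, name):
        if self._rasterizing:
            self._buffer.append(name)
        else:
            # The next vector object forces the pending raster out first,
            # which preserves the drawing order.
            self._flush()
            self.output.append(("vector", name))

    def finalize(self):
        """Flush whatever is still pending at the end of the figure."""
        self._flush()


r = LazyMixedRenderer()
r.start_rasterizing()
r.draw_line("a")
r.stop_rasterizing()
r.start_rasterizing()
r.draw_line("b")
r.stop_rasterizing()
r.draw_line("c")
r.finalize()
print(r.output)
```

In this toy version the two rasterized lines end up in one image entry, followed by the vector line. It does nothing about the spatial-extent problem: a real buffer covering two distant objects would still include the empty pixels between them.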

Anyway, somewhere in the midst of all this is the correct path...

Cheers,
Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA