Subsetting fonts in Postscript

[I've been discussing this off-list with John Hunter, and I thought I'd summarize that conversation in case anyone else on this list has any thoughts or suggestions.]

I've started working on the problem of reducing Postscript output file sizes by saving out only the glyphs that are used in the figure. There are (at least) two alternative approaches:

1. Subset the Truetype font into another Truetype font and embed it as we do now. This could theoretically be done with fonttools/ttx. Writing out .ttf files looks to be rather complex, and there's a lot of griping about the format itself to be found on the 'net. John also mentioned that he'd prefer not to add the requirement of fonttools to the mix from past experience.

2. Convert the Truetype font to a Type 3 font (which is basically a set of standard Postscript commands). There is a small C application (http://www.this.net/~frank/ttconv.tar.gz) that converts TTF to Type 3 that looks to work quite well. Some modifications would have to be made to actually subset the font and to integrate with Python etc., but it's fairly straightforward code, and the licensing is amenable to including it in the matplotlib source tree.
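
Whichever option is chosen, the first step is the same: work out which glyphs the figure actually uses. A minimal sketch of that step follows, assuming the backend can list every (font name, string) pair it draws. The function names, the font names, and the placeholder glyph table are hypothetical and are not part of matplotlib or ttconv.

```python
from collections import defaultdict


def collect_used_chars(text_items):
    """Map each font name to the set of characters drawn with it."""
    used = defaultdict(set)
    for font_name, text in text_items:
        used[font_name].update(text)
    return dict(used)


def subset_glyphs(glyph_table, used_chars):
    """Keep only the glyph entries whose character is in used_chars."""
    return {ch: outline for ch, outline in glyph_table.items()
            if ch in used_chars}


# Hypothetical figure: two strings drawn in one font, one in another.
items = [("Vera", "time (s)"), ("Vera", "title"), ("VeraSe", "x")]
used = collect_used_chars(items)

# Placeholder glyph table standing in for a real font's outlines.
full_font = {chr(c): "<outline %d>" % c for c in range(32, 127)}
subset = subset_glyphs(full_font, used["Vera"])

print(len(full_font), "glyphs in font;", len(subset), "glyphs kept")
```

The saving depends on how few distinct characters a figure uses compared with the size of the font, which is typically a small fraction for axis labels and titles.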

Clearly, I'm leaning toward option #2, but thought I'd open it to the crowd to see if there are any other options or opinions on the matter.

The plan is to make the choice of the existing or new behavior be an option, with the default TBD.

Cheers,
Mike

Michael Droettboom wrote:

[I've been discussing this off-list with John Hunter, and I thought I'd summarize that conversation in case anyone else on this list has any thoughts or suggestions.]

I've started working on the problem of reducing Postscript output file sizes by saving out only the glyphs that are used in the figure. There are (at least) two alternative approaches:

1. Subset the Truetype font into another Truetype font and embed it as we do now. This could theoretically be done with fonttools/ttx. Writing out .ttf files looks to be rather complex, and there's a lot of griping about the format itself to be found on the 'net. John also mentioned that he'd prefer not to add the requirement of fonttools to the mix from past experience.

2. Convert the Truetype font to a Type 3 font (which is basically a set of standard Postscript commands). There is a small C application (http://www.this.net/~frank/ttconv.tar.gz) that converts TTF to Type 3 that looks to work quite well. Some modifications would have to be made to actually subset the font and to integrate with Python etc., but it's fairly straightforward code, and the licensing is amenable to including it in the matplotlib source tree.

Clearly, I'm leaning toward option #2, but thought I'd open it to the crowd to see if there are any other options or opinions on the matter.

I'm very glad to hear that you are working on this, and option #2 sounds good to me. Is the potential advantage of #1 better ultimate rendering quality? Or smaller file size?

It looks like fonttools has been untouched since 2002, correct?

The plan is to make the choice of the existing or new behavior be an option, with the default TBD.

Is there any reason *not* to do the subsetting?

Eric


There was some initial confusion about a potential loss of quality in
truetype/type3 conversions, because of the quadratic vs cubic spline
representations in the two specifications. When we were concerned that
some users might be hit by a loss of quality in conversion, we
considered making the conversion and subsetting optional. Michael
later clarified that the loss (which happens only in corner cases)
would occur in the type3->truetype conversion, and not in the
truetype->type3 case we are interested in, because truetype uses
quadratic curves and type3 uses cubic ones, and any quadratic curve
can be written exactly as a cubic. Unless there is a good reason to
make it optional, I would like to make it as simple as possible and
simply do the conversion and embedding every time. This will make
support and debugging easier.
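
TrueType outlines are built from quadratic Bézier segments, while PostScript paths (the `curveto` operator) use cubic ones. Converting from quadratic to cubic is exact through the standard degree-elevation formula, which is why the truetype->type3 direction does not lose curve accuracy. The sketch below checks this numerically. The function names and the sample control points are illustrative only and are not taken from ttconv.

```python
def quad_point(q0, q1, q2, t):
    """Evaluate a quadratic Bezier curve at parameter t."""
    u = 1.0 - t
    return tuple(u * u * a + 2 * u * t * b + t * t * c
                 for a, b, c in zip(q0, q1, q2))


def cubic_point(c0, c1, c2, c3, t):
    """Evaluate a cubic Bezier curve at parameter t."""
    u = 1.0 - t
    return tuple(u ** 3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t ** 3 * d
                 for a, b, c, d in zip(c0, c1, c2, c3))


def quad_to_cubic(q0, q1, q2):
    """Degree elevation: return the cubic control points of the same curve."""
    c1 = tuple(a + 2.0 / 3.0 * (b - a) for a, b in zip(q0, q1))
    c2 = tuple(c + 2.0 / 3.0 * (b - c) for b, c in zip(q1, q2))
    return q0, c1, c2, q2


# Arbitrary sample segment in font units (made-up numbers).
q = ((0.0, 0.0), (300.0, 700.0), (600.0, 100.0))
c = quad_to_cubic(*q)

# Largest coordinate difference between the two curves over 101 samples.
max_err = max(
    abs(a - b)
    for i in range(101)
    for a, b in zip(quad_point(*q, i / 100.0), cubic_point(*c, i / 100.0))
)
print("max deviation:", max_err)
```

The deviation is at the level of floating-point rounding. The reverse direction has no such formula in general: a cubic segment usually has to be approximated by one or more quadratics.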

JDH


Eric Firing wrote:

Michael Droettboom wrote:

1. Subset the Truetype font into another Truetype font and embed it as we do now. This could theoretically be done with fonttools/ttx. Writing out .ttf files looks to be rather complex, and there's a lot of griping about the format itself to be found on the 'net. John also mentioned that he'd prefer not to add the requirement of fonttools to the mix from past experience.

2. Convert the Truetype font to a Type 3 font (which is basically a set of standard Postscript commands). There is a small C application (http://www.this.net/~frank/ttconv.tar.gz) that converts TTF to Type 3 that looks to work quite well. Some modifications would have to be made to actually subset the font and to integrate with Python etc., but it's fairly straightforward code, and the licensing is amenable to including it in the matplotlib source tree.

Clearly, I'm leaning toward option #2, but thought I'd open it to the crowd to see if there are any other options or opinions on the matter.

I'm very glad to hear that you are working on this, and option #2 sounds good to me. Is the potential advantage of #1 better ultimate rendering quality? Or smaller file size?

Potentially on both counts. Hinting will not be converted, and since TT and PS have slightly different rendering models, there is the potential for rounding error etc. (though I don't know how real of a problem that is.) Also, Type 3 is an ASCII format, so if the file weren't subsetted, the size would certainly be larger. So a lot depends on the ratio of glyphs in the original font to glyphs in the figure, obviously. Of course, we could have an "auto" mode, where whichever is ultimately smaller is written out.
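
The "auto" mode could be as simple as producing both encodings and keeping the shorter one. This is only a sketch under that assumption: the function name, the labels, and the byte strings standing in for real font data are all hypothetical.

```python
def choose_embedding(full_font_data, subset_type3_data):
    """Return (label, data) for whichever embedding is smaller in bytes."""
    if len(subset_type3_data) < len(full_font_data):
        return "type3-subset", subset_type3_data
    # On a tie, keep the existing behavior (full embedded font).
    return "full-truetype", full_font_data


# Stand-in payloads; real data would come from the font converter.
full = b"x" * 60000
few_glyphs = b"y" * 9000      # figure uses a handful of glyphs
many_glyphs = b"y" * 150000   # figure uses most of the font, ASCII overhead

print(choose_embedding(full, few_glyphs)[0])
print(choose_embedding(full, many_glyphs)[0])
```

This costs one extra conversion per font, and it does nothing for the hinting concern, which is about rendering quality and not size.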

It looks like fonttools has been untouched since 2002, correct?

I wasn't able to find anything newer either.

The plan is to make the choice of the existing or new behavior be an option, with the default TBD.

Is there any reason *not* to do the subsetting?

Yes, if hinting is a requirement, for example if the PS file is to be used on a low-resolution printer or screen. That may be somewhat of a side case.

Cheers,
Mike

Will this (whichever method is chosen) work for PDF too?

Just wondering,

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

You might take a look at what kind of PostScript and PDF output you
get from cairo right now, (since cairo has many different kinds of
font subsetting, (type3, type42 and others), and it's regularly being
tested on as many PostScript and PDF viewers as possible).

I don't know if there's anything special about the PostScript output
you're currently producing that wouldn't make it acceptable to use
cairo's PostScript output directly. But even if you just want code,
it's inside cairo under the LGPL.

-Carl


Hey Carl,

I looked at cairo when we first started with the postscript backend,
but in the bad old days it was just a raster dump. I understand it
has come a long way since.

mpl's postscript backend supports latex expressions in PS output,
which requires a fair amount of complex trickery in the postscript
backend, though we could probably do it with embedded rasters in
cairo. The postscript backend is also standalone with no dependencies
other than mpl and numpy, and adding cairo to the mix might be a bit
difficult across platforms for some users (though this appears to
have gotten a lot better too). LGPL means we cannot reuse the code.

While I like the idea of using cairo for both raster and vector
outputs in principle because it offloads a lot of work onto a large
and well supported project, it would probably take a fair amount of
work to get all of mpl's functionality into the cairo backend (I don't
know this since I have not tested the backend for some time, but does
it support, for example unicode_demo, mathtext_demo, usetex, and
image_demo ?).

JDH


Carl Worth wrote:

You might take a look at what kind of PostScript and PDF output you
get from cairo right now, (since cairo has many different kinds of
font subsetting, (type3, type42 and others), and it's regularly being
tested on as many PostScript and PDF viewers as possible).
  

Thanks for the tip. Indeed, using the unicode_test.py example (which probably has a greater than average amount of text in it), the file sizes in bytes are (with the size of the font section in parentheses):

    backend_ps.py: 135763 (127211)
    cairo: 49102 (39669)

Interestingly, the non-font part is slightly larger for Cairo (9433 vs. 8552)
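
The non-font figures come from subtracting the font section from the total. A short check of that arithmetic, using only the four numbers quoted above (the variable names are made up for the example):

```python
# (total bytes, font section bytes) as reported in the message above.
sizes = {
    "backend_ps.py": (135763, 127211),
    "cairo": (49102, 39669),
}

non_font = {}
for name, (total, font) in sizes.items():
    non_font[name] = total - font
    share = 100.0 * font / total
    print("%-14s non-font: %5d bytes, font share: %.1f%%"
          % (name, non_font[name], share))

# Overall size ratio between the two outputs.
ratio = sizes["backend_ps.py"][0] / sizes["cairo"][0]
print("backend_ps.py / cairo: %.2f" % ratio)
```

In both files the font section accounts for most of the bytes, which is why subsetting is where the saving is.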

I don't know if there's anything special about the PostScript output
you're currently producing that wouldn't make it acceptable to use
cairo's PostScript output directly. But even if you just want code,
it's inside cairo under the LGPL.
  

It may be worthwhile to look at Cairo's font subsetting code if it's determined that the Python Postscript backend has other advantages. I'm sure people who've been here longer than I have can better speak to those pros and cons.

Cheers,
Mike

Michael Droettboom wrote:

Carl Worth wrote:
  

You might take a look at what kind of PostScript and PDF output you
get from cairo right now, (since cairo has many different kinds of
font subsetting, (type3, type42 and others), and it's regularly being
tested on as many PostScript and PDF viewers as possible).
  

Thanks for the tip. Indeed, using the unicode_test.py example (which probably has a greater than average amount of text in it), the file sizes in bytes are (with the size of the font section in parentheses):

    backend_ps.py: 135763 (127211)
    cairo: 49102 (39669)

Interestingly, the non-font part is slightly larger for Cairo (9433 vs. 8552)
  

Though, I should add, there is a bug in Cairo output with unicode_demo.py: The y-axis label reads "stream-vera/VeraSe.ttf"...

Cheers,
Mike

Unfortunately, because it is LGPL, I don't think we can in good
conscience look at the code, because doing so probably violates the
spirit of the LGPL, which says you can link with it but not reuse the
code in a non-GPL/LGPL program. Others may have a different
interpretation, and if look-but-don't-copy is OK under the LGPL, as it
is in journalism (read and summarize with citation but don't
plagiarize), then it's fine by me, but that's not my current
understanding.

JDH


John Hunter wrote:

It may be worthwhile to look at Cairo's font subsetting code if it's
determined that the Python Postscript backend has other advantages. I'm
sure people who've been here longer than I have can better speak to
those pros and cons.

Unfortunately, because it is LGPL, I don't think we can in good
conscience look at the code, because doing so probably violates the
spirit of the LGPL, which says you can link with it but not reuse the
code in a non-GPL/LGPL program. Others may have a different
interpretation, and if look-but-don't-copy is OK under the LGPL, as it
is in journalism (read and summarize with citation but don't
plagiarize), then it's fine by me, but that's not my current
understanding.

Agreed. I haven't looked at the Cairo source yet, so you can still consider me "untainted" in that regard. My earlier comment was mainly out of licensing confusion. ;)

Do you agree that it is still an open question whether it's better to spend time improving the matplotlib PS backend, or to fix (if possible) the issues with matplotlib's Cairo integration? It does ultimately come down to a tradeoff: an additional dependency vs. extra maintenance burden. Maybe it would be a good start to enumerate the Cairo backend's current shortcomings. (So far I've seen some minor text bugs, and math rendering is raster dumps.)

Cheers,
Mike

···

On 7/5/07, Michael Droettboom <mdroe@...31...> wrote:

Do you agree that it is still an open question whether it's better to
spend time improving the matplotlib PS backend, or to fix (if possible)
the issues with matplotlib's Cairo integration? It does ultimately come
down to a tradeoff: an additional dependency vs. extra maintenance

The postscript backend as it stands is in good shape, and is full
featured (Darren can tell you how much work he has put into supporting
and enhancing the latex support). The last major issue with it is the
font size issue, and with your help a solution is on the horizon. So
it is definitely a good use of time to fix this last bit. It doesn't
sound like your "option 2" is a ton of work, but correct me if I'm
wrong.

While I would love to see cairo become a full featured backend, and
for there to be additional GUI support like tkcairo, wxcairo, etc, we
are a lot farther from that goal than we are to getting the font sizes
down in the existing postscript backend. And I like the fact that mpl
is completely BSD-ish -- relying on a core component which is LGPL
would be a step back in my book, though having it as an option would
be great.

http://www.scipy.org/License_Compatibility

burden. Maybe it would be a good start to enumerate the Cairo backend's
current shortcomings.

As a start, you might try adding cairo to the list of backends in
examples/backend_driver.py and see if everything passes, and take a
look at the generated images, eg compared to those of Agg, and see if
you identify any other discrepancies. Steve Chaplin who wrote the
cairo backend can also elaborate.

(So far I've seen some minor text bugs, and math
rendering is raster dumps.)

Do you mean mathtext or usetex? - The former is mpl's own math layout
using the cm*.ttf files, and should work like any other text in the
file. The latter uses tex and dvipng rasters (at least in agg), but I
don't think it is supported in cairo. So I am not sure where these
rasters are coming from, unless cairo is converting all text to
rasters.

JDH

···

On 7/5/07, Michael Droettboom <mdroe@...31...> wrote:

John Hunter wrote:

Do you agree that it is still an open question whether it's better to
spend time improving the matplotlib PS backend, or to fix (if possible)
the issues with matplotlib's Cairo integration? It does ultimately come
down to a tradeoff: an additional dependency vs. extra maintenance

The postscript backend as it stands is in good shape, and is full
featured (Darren can tell you how much work he has put into supporting
and enhancing the latex support). The last major issue with it is the
font size issue, and with your help a solution is on the horizon. So
it is definitely a good use of time to fix this last bit. It doesn't
sound like your "option 2" is a ton of work, but correct me if I'm
wrong.

No, not a ton of work. And this context is helpful.

While I would love to see cairo become a full featured backend, and
for there to be additional GUI support like tkcairo, wxcairo, etc, we
are a lot farther from that goal than we are to getting the font sizes
down in the existing postscript backend. And I like the fact that mpl
is completely BSD-ish -- relying on a core component which is LGPL
would be a step back in my book, though having it as an option would
be great.

http://www.scipy.org/License_Compatibility

Agreed.

Do you mean mathtext or usetex? - The former is mpl's own math layout
using the cm*.ttf files, and should work like any other text in the
file. The latter uses tex and dvipng rasters (at least in agg), but I
don't think it is supported in cairo. So I am not sure where these
rasters are coming from, unless cairo is converting all text to
rasters.

mathtext_demo.py -- It originally looked like the math text was rasterized, but the tick labels were not. On closer inspection, it seems all the text is rasterized. The fonts are not rasterized when using the Cairo backend with unicode_demo.py. I haven't looked into that any deeper...

Cheers,
Mike

···

On 7/5/07, Michael Droettboom <mdroe@...31...> wrote:

> Do you agree that it is still an open question whether it's better to
> spend time improving the matplotlib PS backend, or to fix (if possible)
> the issues with matplotlib's Cairo integration? It does ultimately come
> down to a tradeoff: an additional dependency vs. extra maintenance

The postscript backend as it stands is in good shape, and is full
featured (Darren can tell you how much work he has put into supporting
and enhancing the latex support). The last major issue with it is the
font size issue, and with your help a solution is on the horizon. So
it is definitely a good use of time to fix this last bit. It doesn't
sound like your "option 2" is a ton of work, but correct me if I'm
wrong.

It was a fair amount of work figuring out how to support latex. Jouni started
work on a dvi parser, see dviread.py in matplotlib/lib/matplotlib, which
could greatly simplify the gymnastics we currently use to support latex in ps
output. If dviread were to be further developed, latex could also be used in
conjunction with the pdf backend (Jouni's reason for starting dviread), the
svg backend, and I guess it would work with cairo as well. But making dviread
robust will probably take more work than options 1 or 2, so it is probably
best to pursue one of those options for now.

While I would love to see cairo become a full featured backend, and
for there to be additional GUI support like tkcairo, wxcairo, etc, we
are a lot farther from that goal than we are to getting the font sizes
down in the existing postscript backend. And I like the fact that mpl
is completely BSD-ish -- relying on a core component which is LGPL
would be a step back in my book, though having it as an option would
be great.

Why can't we all just get along?

Do you mean mathtext or usetex? - The former is mpl's own math layout
using the cm*.ttf files, and should work like any other text in the
file. The latter uses tex and dvipng rasters (at least in agg), but I
don't think it is supported in cairo. So I am not sure where these
rasters are coming from, unless cairo is converting all text to
rasters.

I think he is right, gtkcairo converts mathtext to rasters. usetex is not
supported in gtkcairo.

Darren

···

On Thursday 05 July 2007 03:46:13 pm John Hunter wrote:

On 7/5/07, Michael Droettboom <mdroe@...31...> wrote:

> I don't know if there's anything special about the PostScript output
> you're currently producing that wouldn't make it acceptable to use
> cairo's PostScript output directly. But even if you just want code,
> it's inside cairo under the LGPL.

I looked at cairo when we first started with the postscript backend,
but in the bad old days it was just a raster dump. I understand it
has come a long way since.

Yes, it's definitely a _lot_ better than that now. As of any recent
release of cairo, (1.4.x), you will probably get all-vector output for
the kinds of things I would expect matplotlib to do.

If you do hit something that requires a raster-based fallback in
cairo, (translucence or similar), the current releases of cairo do
still compute the fallback by doing full-page rasterization. But
there's a patch already put together, (by the expert Adrian Johnson),
that makes cairo do rasterization for only the minimal necessary
region, (so expect that in cairo 1.6 in the future).

mpl's postscript backend supports latex expressions in PS output,
which requires a fair amount of complex trickery in the postscript
backend, though we could probably do it with embedded rasters in
cairo.

Embedding latex expressions is really cool. If you do try something
like this with cairo and find that you wish cairo would do something
that it can't, then please let me know.

The postscript backend is also standalone with no dependencies
other than mpl and numpy, and adding cairo to the mix might be a bit
difficult across platforms for some users (though this appears to
have gotten a lot better too).

Yes, cairo should work extremely well across platforms, (and
particularly the "generic" backends like the image, PDF, PostScript,
and SVG backends). The only outstanding platform-specific issues are in
display-device-specific backends such as in cairo's quartz backend,
(but even it does work extremely well---just not quite perfectly---and
the mozilla people are working hard to complete it).

LGPL means we cannot reuse the code.

That's your choice of course.

As far as type3 goes, there's really nothing special there. It would
be just as easy (or easier) to just read the PostScript language
reference and implement things directly as compared to reading cairo's
code. That's all I did to write it originally, and it's not hard at
all.

Now, some of the other font subsetting work in cairo is a bit more
sophisticated. Adrian Johnson has done most of that, so he would
probably be the person you would need to ask if you would like the
code to be made available under a more liberal licence than the LGPL,
(or the Mozilla Public License as cairo is currently made available
under either of those).

While I like the idea of using cairo for both raster and vector
outputs in principle because it offloads a lot of work onto a large
and well supported project, it would probably take a fair amount of
work to get all of mpl's functionality into the cairo backend (I don't
know this since I have not tested the backend for some time, but does
it support, for example unicode_demo, mathtext_demo, usetex, and
image_demo ?).

It doesn't look to me like there's a lot of missing work.

Here are the results from unicode_demo:

  Looking at matplotlib's cairo backend

To summarize, all of the PNG, PostScript, PDF, and SVG output looks
fine from the cairo backend. Meanwhile, the PDF backend, (as of
0.87.7) seems to generate broken output for the accented characters,
and the SVG backend doesn't position/scale the text correctly.

Cairo's PDF and PostScript output is smaller than matplotlib's native
output, (factor of 2.75), while cairo's SVG output is a fair amount
larger than matplotlib's, (factor of 11), since it's embedding all of
the text glyphs, (which could be either good or bad depending on what
you really want).

I didn't seem to have any usetex demo installed with the Debian 0.87.7
package of python-matplotlib-doc, and with both mathtext_demo and
image_demo I got the following inscrutable error messages:

  /usr/lib/python2.4/site-packages/matplotlib/backends/backend_cairo.py:329:
  UserWarning: cairo with Numeric support is required for
  _draw_mathtext()

  /usr/lib/python2.4/site-packages/matplotlib/backends/backend_cairo.py:162:
  UserWarning: cairo with Numeric support is required for draw_image()

Does anybody know what that could mean? I have no idea what "cairo
with Numeric support" is. Is it perhaps something specific to the
pycairo python bindings of cairo?

-Carl

···

On Thu, 5 Jul 2007 13:26:22 -0500, "John Hunter" wrote:

On 7/5/07, Carl Worth <cworth@...528...> wrote:

The postscript backend as it stands is in good shape, and is full
featured (Darren can tell you how much work he has put into supporting
and enhancing the latex support). The last major issue with it is the
font size issue, and with your help a solution is on the horizon. So
it is definitely a good use of time to fix this last bit. It doesn't
sound like your "option 2" is a ton of work, but correct me if I'm
wrong.

For what it's worth, I think I'd be inclined to agree with you
there. If your existing code is working just fine, then switching to
cairo is just more work. But if you do start having to do any serious
maintenance, then you might want to reconsider.

http://www.scipy.org/License_Compatibility

Thanks, John, for sharing this essay. Please allow me to respond to a
few points:

  In my experience, the benefits of collaborating with the
  private sector are real, whereas the fear that some private
  company will "steal" your product and sell it in a proprietary
  application leaving you with nothing is not.

In my experience, real harm can come when proprietary
modifications to software made available under a permissive license
are not contributed back. An extremely clear case is that of the X
Window System which went through a period of several independent
software vendors trying to out-compete each other on their own
proprietary modifications to the system, (resulting in the near death
of the system altogether).

I've had some discussions with Jim Gettys about that process and how
the MIT license for X has played out over the years. You argue that a
project most needs the extra users provided by a permissive license
during its formative years until it reaches critical mass and the
network effects kick in. Jim actually argues the point differently and
says that the extra protections of the GPL are most necessary during
the formative period, but not at all needed once the project reaches
critical mass. So I've heard him express that he wishes there were a
way to allow a project to grow under the GPL and then change to
something like the MIT license once it reaches critical
mass.

  There is a lot of GPL code in the world, and it is a constant
  reality in the development of matplotlib that when we want to
  reuse some algorithm, we have to go on a hunt for a non-GPL
  version.

So that's a cost that you need to weigh against the decision to not be
able to accept any GPL code into your project. But I think the fact
that there _is_ a lot of GPL code in the world is a strong argument
against your original thesis that a license more permissive than the
GPL is necessary to bootstrap a free software project to critical
mass. There _is_ a lot of GPL code, which means there _are_ a lot of
users of that code, and a lot of those users are businesses that don't
have a problem using, (and modifying, and contributing back to), GPL
code.

  There are two unpalatable options. 1) Go with GPL and lose the
  mind-share of the private sector 2) Forgo GPL code and retain
  the contribution of the private sector.

You've chosen (2) along with a decision to try to campaign authors of
GPL code to relicense their code as BSD/MIT (ish) whenever you want to
use it. I would guess you'll find that quite difficult in many cases,
(I don't agree that the GPL is most often chosen without intention
just because it is "famous").

I think an easier route to take path (2) is to use the LGPL for your
library, and then only have to convince authors to re-license the
subset of their GPL application code as LGPL that you're actually
interested in incorporating into your library. I would predict that
you will be more successful at that more often than convincing people
to relicense GPL to BSD/MIT (ish).

You only bring the LGPL up at the end of your essay as almost an
afterthought and dismiss it with a very vague, "but many companies are
still loath to use it out of legal concerns". Do you have actual
evidence to point to for that?

It would be simpler if there were direct experiments we could run to
measure some of these things, but there aren't, (and conditions do
continue to change). My experience with the cairo project suggests
that we've been able to achieve a very successful library
implementation, (with plenty of "corporate" contribution), with an
LGPL (and MPL) license.

  This is a very tough decision because their is a lot of very
  high quality software that is GPL and we need to use it;

Network effects are strong---when they're good, don't fight them. :-)

And I've even been annoyed enough with having to get code relicensed
from GPL to LGPL+MPL for use in cairo that I'm thinking the next
library I invent might be simply GPL from the beginning.

Which brings me to my final point. I think it's very interesting (and
worthwhile) to debate license decisions like this. But at the end of
the day, when a project chooses a free software license, and that
project becomes at all "established" it's probably rarely a good idea
to change that license. I just don't think that it's right to engage a
community with one set of ground rules, and then to go and change
those rules out from under the community.

So, even if the current license from matplotlib would allow you to
easily change from it to the LGPL (which I think it does), I wouldn't
make any argument that you should think of changing the project
license.

don't think it is supported in cairo. So I am not sure where these
rasters are coming from, unless cairo is converting all text to
rasters.

Definitely not converting all text to raster, (unless someone's using
an ancient version of cairo).

-Carl

···

On Thu, 5 Jul 2007 14:46:13 -0500, "John Hunter" wrote:

Carl,

I have made a few changes in svn to facilitate testing cairo with backend_driver (and to fix a bug that turned up), and I will do a bit more on this later today or tomorrow. The result of a quick pass through the backend_driver test with png output is quite encouraging, though. There are some bugs in string placement, image handling, and clipping, but most things work, including mathtext. Default fonts seem to be different.

Eric

Carl Worth wrote:

···

On Thu, 5 Jul 2007 13:26:22 -0500, "John Hunter" wrote:

On 7/5/07, Carl Worth <cworth@...528...> wrote:

I don't know if there's anything special about the PostScript output
you're currently producing that wouldn't make it acceptable to use
cairo's PostScript output directly. But even if you just want code,
it's inside cairo under the LGPL.

I looked at cairo when we first started with the postscript backend,
but in the bad old days it was just a raster dump. I understand it
has come a long way since.

[...]

I have made a few changes in svn to facilitate testing cairo with
backend_driver (and to fix a bug that turned up), and I will do a bit
more on this later today or tomorrow.

Cool. I've started downloading all the matplotlib source history with
git-svn, so once that's done I'll take a look. Hopefully it's obvious
how to run through the cairo backend with the test suite---otherwise
I'll ask.

The result of a quick pass
through the backend_driver test with png output is quite encouraging,
though. There are some bugs in string placement, image handling, and
clipping, but most things work, including mathtext. Default fonts seem
to be different.

If there's anything I can do to help I'll do what I can---let me know.

Oh, and I meant to say that it's a bit annoying that
savefig("somefile") doesn't work with the cairo backend. My
understanding is that this is supposed to automatically select the
correct file extension based on the backend type, (with the implicit
assumption that any given backend supports only one file type).

That seems like a useful way of using savefig, and I don't think it's
correct to break it just because cairo supports multiple file types.

My suggestion would be to make it default to .png if no additional
information is provided, and then to also add some sort of pseudo
backends so that the other cairo-supported file types could easily be
obtained with this same savefig call. For example something like:

  python myscript.py -dCairoPDF

What do you think? Would that be simple to implement?

-Carl

PS. I'd be more inclined to name the backends things like cairo-pdf
than CairoPDF but it seems that the latter better fits the existing
convention for matplotlib backend naming.

···

On Thu, 05 Jul 2007 13:22:11 -1000, Eric Firing wrote:

Carl Worth wrote:

I have made a few changes in svn to facilitate testing cairo with
backend_driver (and to fix a bug that turned up), and I will do a bit
more on this later today or tomorrow.

Cool. I've started downloading all the matplotlib source history with
git-svn, so once that's done I'll take a look. Hopefully it's obvious
how to run through the cairo backend with the test suite---otherwise
I'll ask.

                                      The result of a quick pass
through the backend_driver test with png output is quite encouraging,
though. There are some bugs in string placement, image handling, and
clipping, but most things work, including mathtext. Default fonts seem
to be different.

If there's anything I can do to help I'll do what I can---let me know.

Thanks. One place to start would be with the string placement. If you compare png output from Cairo vs Agg, I think you will find that strings are being positioned differently, sometimes very subtly, sometimes (especially for plot title) by quite a bit. If you can figure out where the differences are coming from, we can decide whether changes are needed in Cairo, in one or more of the other backends, or both. (I think SVG also positions strings quite differently; I think ps positioning is much closer to Agg.)
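As a rough illustration of how such a comparison could be narrowed down, here is a minimal sketch that reports the bounding box of the pixels that differ between two renderings. It assumes both PNG files have already been decoded (by whatever image library is at hand, not shown here) into equally sized nested lists of grayscale values; the function name `diff_bounding_box` and the `tolerance` parameter are hypothetical and not part of matplotlib or cairo.

```python
def diff_bounding_box(image_a, image_b, tolerance=0):
    """Return (top, left, bottom, right) of the region where two equally
    sized grayscale images differ by more than `tolerance`, or None."""
    same_shape = len(image_a) == len(image_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(image_a, image_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    rows, cols = [], []
    for r, (row_a, row_b) in enumerate(zip(image_a, image_b)):
        for c, (pixel_a, pixel_b) in enumerate(zip(row_a, row_b)):
            # Record every pixel whose values differ beyond the tolerance.
            if abs(pixel_a - pixel_b) > tolerance:
                rows.append(r)
                cols.append(c)

    if not rows:
        return None  # the two renderings match everywhere
    return min(rows), min(cols), max(rows), max(cols)
```

A box that hugs the title or a tick label would point at string placement, while differences spread over the whole axes would suggest something else, such as clipping or image handling.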

Oh, and I meant to say that it's a bit annoying that
savefig("somefile") doesn't work with the cairo backend. My
understanding is that this is supposed to automatically select the
correct file extension based on the backend type, (with the implicit
assumption that any given backend supports only one file type).

That seems like a useful way of using savefig, and I don't think it's
correct to break it just because cairo supports multiple file types.

My suggestion would be to make it default to .png if no additional
information is provided, and then to also add some sort of pseudo

Yes, I was looking at making that the default.

backends so that the other cairo-supported file types could easily be
obtained with this same savefig call. For example something like:

  python myscript.py -dCairoPDF

What do you think? Would that be simple to implement?

I think it would be easy. It might be done most easily and consistently via the rc mechanism. Figure.savefig already has a kwarg for it, so it would be a matter of having that kwarg default to the rc setting. For the backend specification I would suggest "-dCairo.pdf" etc, which is mnemonic and easy to parse.
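A minimal sketch of that idea, with the caveat that none of these names come from the matplotlib source: the `RC` dictionary, its `"cairo.format"` key, and the `resolve_filename` helper are all assumptions made for illustration. It only shows the fallback order: an extension already in the filename, then an explicit argument, then the rc default.

```python
import os

# Hypothetical stand-in for the rc settings; the key name is an assumption.
RC = {"cairo.format": "png"}


def resolve_filename(filename, fmt=None, rc=RC):
    """Return `filename` with an extension, chosen in priority order:
    one already present in the name, the explicit `fmt`, then the rc default."""
    _, ext = os.path.splitext(filename)
    if ext:
        # The caller already named a file type; leave it alone.
        return filename
    chosen = fmt if fmt is not None else rc["cairo.format"]
    return f"{filename}.{chosen.lstrip('.')}"
```

With something along these lines, a bare savefig("somefile") would end up writing somefile.png unless the rc setting or the keyword argument says otherwise.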

Eric

···

Carl Worth wrote:
[...]

My suggestion would be to make it default to .png if no additional
information is provided, and then to also add some sort of pseudo
backends so that the other cairo-supported file types could easily be
obtained with this same savefig call. For example something like:

  python myscript.py -dCairoPDF

What do you think? Would that be simple to implement?

It's done, except that it is '-dCairo.pdf'.

Also, examples/backend_driver.py now supports case-insensitive specification of backends, and the same form of cairo specification. See the docstring at the top.
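For readers who want a picture of what parsing such a specification might involve, here is a small sketch. It is not the code in backend_driver.py; the `KNOWN_BACKENDS` tuple and the `parse_backend_spec` function are hypothetical, and the only behavior taken from this thread is the 'Cairo.pdf' form and the case-insensitive matching.

```python
# Hypothetical list of backend names, for illustration only.
KNOWN_BACKENDS = ("Agg", "Cairo", "PS", "SVG")


def parse_backend_spec(spec, known=KNOWN_BACKENDS):
    """Split a spec such as '-dCairo.pdf' or 'cairo.pdf' into
    (canonical backend name, file format or None)."""
    if spec.startswith("-d"):
        spec = spec[2:]  # drop the command-line flag prefix
    name, sep, fmt = spec.partition(".")
    # Match the backend name without regard to case.
    lookup = {backend.lower(): backend for backend in known}
    try:
        backend = lookup[name.lower()]
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None
    return backend, (fmt.lower() if sep and fmt else None)
```

Splitting on the first dot is what makes the 'Cairo.pdf' form easy to parse: the part before the dot selects the backend and the part after it selects the file type.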

I found that the cairo backend writes ps files but not eps--it enlists backend_ps for eps files. Does pycairo not have a way to specify eps rather than ps output?

Eric

···
