[ Curiosa ] Fun with unicode

Hello,

I've been playing with the unicode rendering code that got added in
matplotlib 0.74. All tests have been done on Linux with the 0.74-1
debian package, lazy me... They should work on any platform, but you
will have to find out the gory details (!).

[ Most of these are examples, but there are a few remarks for the ps backend
   maintainers mostly, enclosed in square brackets in the text. No patches yet,
   but if you find the ideas useful, I can give it a try later. ]

To try this out, you may also need the rather complete FreeFont unicode font:
1) download the ttf from
    http://savannah.nongnu.org/download/freefont/freefont-ttf.tar.gz
2) put the .ttf files in your preferred TrueType fonts directory
3) remove ~/.ttffont.cache
4) restart matplotlib

Part I: Guess what I do for a living

Get your favorite interactive backend and display pretty plots

>>> plot([0.3,0.01,-0.01,-0.01,-0.1,-0.1,-0.3,-0.01,0.01,0.01,0.1,0.1,0.3],\
... [105,100,98,90,92,101,105,100,98,90,92,101,105],'kD-')
>>> ylim(85,110)
>>> xlabel(u'\u03bc\u2080H(T)', name='FreeSans')
>>> ylabel(u'R(\u03a9)', name='FreeSans')
>>> ^D

or weird formulas

>>> figtext(0.5,0.5,u'\u0127\u03c9 \u226a k\u0432T',name='FreeSerif',\
... size=30, ha='center', va='center', color='r')
>>> ^D

You can also save to svg, and even to postscript (or eps) provided you set the
ps.useafm preference to False for now.

Part II: All work and no play...

>>> plot([0.3,0.01,-0.01,-0.01,-0.1,-0.1,-0.3,-0.01,0.01,0.01,0.1,0.1,0.3],\
... [105,100,107,90,92,101,105,100,98,90,92,101,105],'kD-')
>>> ylim(85,110)
>>> text(-0.01,107,u' \u261c booh! the ugly artifact!',name='FreeSerif',\
... size=20, va='top', ha='left')
>>> ^D

:wink:

Part III: Ugly, dirty and mean

Now it's time to produce a PDF. Run ps2pdf on one of the plots above,
and look at the ugly Type 3 fonts in your preferred PDF viewer.
The only way to get a decent PDF is to set ps.useafm to True again.

For this to work, we have to provide the AFM files for FreeFont:
1) download the source of the font from:
    http://savannah.nongnu.org/download/freefont/freefont-sfd.tar.gz
2) download fontforge from fontforge.sourceforge.net
3) open each .sfd file in freefont, and run File\Generate Fonts;
    choose type 'PS Type 0'; this should produce a corresponding
    .afm and .ps file; save the .ps file for later.
4) move the afm file to a directory which is searched by matplotlib.
    Any subdirectory of /usr/share/fonts/ will do, provided said
    subdirectory is not a symlink
    [ is this a bug ? the implementation is in lib/font_manager.py,
      function x11FontDirectory(); os.path.walk() ignores symlinks ]
5) remove ~/.afmfont.cache
6) restart matplotlib
7) when you save the first .ps figure, the cache is rebuilt
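
[ Editor's sketch for the symlink remark in step 4. This is not matplotlib's font_manager code; it only illustrates, in current Python, how a directory scan can be told to follow symlinked subdirectories. The function name find_afm_files is made up for this example, and os.walk() with followlinks=True is assumed to be an acceptable replacement for os.path.walk(). ]

```python
import os


def find_afm_files(root, follow_symlinks=True):
    """Return the sorted paths of all .afm files found below root."""
    found = []
    # followlinks=False is the default and skips symlinked directories,
    # which matches the behaviour described in step 4.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        for name in filenames:
            if name.lower().endswith(".afm"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Note that following links can loop forever if a symlink points back to one of its own parent directories, which may be why the scan avoids them.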

Now that we have a postscript, we need to convince ghostscript to display it.
The first step is to provide the Type 0 fonts, like this:
1) move the .ps files we previously saved into a directory in ghostscript's
    path (try gs -h). A subdirectory won't work this time. Don't ask me why.
2) rename the font file to the name of the font, without extension, like
    $ mv FreeSans.ps FreeSans
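
[ Editor's sketch: the rename in step 2 for a whole directory at once. It assumes, as in the example above, that each file name without its extension is the font name; strip_ps_extension is a hypothetical helper, not part of any tool mentioned here. ]

```python
from pathlib import Path


def strip_ps_extension(font_dir):
    """Rename every *.ps file in font_dir to its bare name; return the new names."""
    renamed = []
    for ps_file in sorted(Path(font_dir).glob("*.ps")):
        target = ps_file.with_suffix("")  # FreeSans.ps -> FreeSans
        if target.exists():
            continue  # never overwrite a file that is already there
        ps_file.rename(target)
        renamed.append(target.name)
    return renamed
```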

If we try to look at our figure now, ghostscript will complain about
'/rangecheck yada yada' and fail to display the figure. This is because
postscript doesn't understand utf-8 encoding.
Postscript does however understand unicode hexa codes. So we have to replace
(R(\316\251)) [ octal representation of utf-8 characters ]
with
<0052002803a90029> [ each 4 hexa figures are one character ]
For now, we have to do that manually in our favorite text editor. To compute
the hexa code in python, we do:
>>> unistr=u'R(\u03a9)'
>>> print(('<'+'%04x'*len(unistr)+'>') % tuple([ord(c) for c in unistr]))
>>> ^D
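
[ Editor's sketch: the same conversion as two small functions, written for current Python. The names ps_hex_string and utf8_octal_string are made up for this example. The hex form assumes every character fits in 16 bits, which is what the four-hex-digit scheme above implies; the octal form only escapes non-ASCII bytes and does not handle backslashes or unbalanced parentheses. ]

```python
def ps_hex_string(text):
    """Encode text as a PostScript hex string, four hex digits per character."""
    codes = []
    for ch in text:
        code = ord(ch)
        if code > 0xFFFF:
            raise ValueError("character U+%X does not fit in 16 bits" % code)
        codes.append("%04x" % code)
    return "<" + "".join(codes) + ">"


def utf8_octal_string(text):
    """Show text the way the UTF-8 output looks: octal escapes for non-ASCII bytes."""
    parts = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            parts.append(chr(byte))
        else:
            parts.append("\\%03o" % byte)
    return "(" + "".join(parts) + ")"


print(utf8_octal_string("R(\u03a9)"))  # (R(\316\251))
print(ps_hex_string("R(\u03a9)"))      # <0052002803a90029>
```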

[ It looks like the ps backend should do just that with unicode strings if
   ps.useafm is True, as utf-8 is useless anyway.
   Using unicode hexa may also allow a much simpler implementation of
   draw_unicode() (in lib/backends/backend_ps.py) in the Type 42 case, by
   avoiding the need to position the glyphs one by one ]

I successfully tested .eps files produced with this procedure on both
a recent ghostscript and acrobat distiller; distiller or ps2pdf will produce
PDFs with nice embedded Type 1 fonts.

Part IV: Publish or perish

Producing pretty PDFs is well and nice, but most publishers will ask for
.eps with all fonts embedded. So we have to embed the fonts into the .eps
file. I could find no program to do this. DO NOT use gs -sDEVICE=pswrite
for that. Not only will ghostscript mangle the fonts, but also the plots (!).

Luckily, the FreeSans.ps from above is already a postscript with embedded
fonts, so we are golden. Just cat the font files together with the .eps
and merge the headers and footers by hand.
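
[ Editor's sketch of the "cat and merge" step, under stated assumptions: the .eps has a standard %%EndComments header line, and wrapping the font program in the DSC comments %%BeginResource / %%EndResource is acceptable to the publisher's tools. The function embed_font_in_eps is hypothetical. It does not strip the font file's own header lines and does not add a %%DocumentSuppliedResources entry to the .eps header, so some hand editing may still be needed. ]

```python
def embed_font_in_eps(eps_text, font_name, font_program):
    """Return eps_text with font_program inserted right after %%EndComments."""
    lines = eps_text.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.startswith("%%EndComments"):
            if not font_program.endswith("\n"):
                font_program += "\n"
            resource = [
                "%%BeginResource: font " + font_name + "\n",
                font_program,
                "%%EndResource\n",
            ]
            return "".join(lines[: index + 1] + resource + lines[index + 1:])
    raise ValueError("no %%EndComments line found in the EPS header")
```

Typical use would be to read the figure and the renamed font file as text, call the function once per font, and write the result to a new .eps file.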

[ It would be nice to have a ps.embedfonts preference. For Type 0, this is
easy, as above; I don't know for Type 1. Also, it would be good to embed
only the needed glyphs, but I haven't looked at how to do it ]

Well, that's all for tonight. In conclusion, unicode support already works
very well, and allows lots of fun things. Thank you guys for the good
work.

BC

Baptiste,

Thank you so much for this tutorial. It was very easy to follow. I have a
couple of comments sprinkled through the instructions.

> Part III: Ugly, dirty and mean
>
> Now it's time to produce a PDF. Run ps2pdf on one of the plots above,
> and look at the ugly Type 3 fonts in your preferred PDF viewer.
> The only way to get a decent PDF is to set ps.useafm to True again.

I copied my ttf fonts to matplotlib/fonts/ttf/, and reinstalled MPL from CVS.
At this point, with ps.useafm=False, I was able to generate a pdf with really
nice looking fonts. How were your fonts ugly?

> For this to work, we have to provide the AFM files for FreeFont:
> 1) download the source of the font from:
>     http://savannah.nongnu.org/download/freefont/freefont-sfd.tar.gz
> 2) download fontforge from fontforge.sourceforge.net
> 3) open each .sfd file in freefont, and run File\Generate Fonts;
>     choose type 'PS Type 0'; this should produce a corresponding
>     .afm and .ps file; save the .ps file for later.

This was a little confusing, I think it should read 'open each .sfd file in
fontforge, and run ...' I was able to generate afm and ps files, but I think
something went wrong (see below). Did you have any trouble with this step?

> 4) move the afm file to a directory which is searched by matplotlib.
>     Any subdirectory of /usr/share/fonts/ will do, provided said
>     subdirectory is not a symlink
>     [ is this a bug ? the implementation is in lib/font_manager.py,
>       function x11FontDirectory(); os.path.walk() ignores symlinks ]
> 5) remove ~/.afmfont.cache
> 6) restart matplotlib
> 7) when you save the first .ps figure, the cache is rebuilt

At this point, I was not able to save a ps figure with ps.useafm=True. I get
the error in the traceback below my signature. Maybe fontforge had trouble
generating the afm files?

At any rate, these fonts look really great. I'm going to start working on the
unicode mappings so you can do '$\hbar$' instead of u'\u0127'. Again, thanks
for the tutorial! Can we commit the freefont ttf's to CVS?

Darren

/usr/lib/python2.4/site-packages/matplotlib/pylab.py in savefig(*args,
**kwargs)
    738 def savefig(*args, **kwargs):
    739 fig = gcf()
--> 740 return fig.savefig(*args, **kwargs)
    741 if Figure.savefig.__doc__ is not None:
    742 savefig.__doc__ = Figure.savefig.__doc__

/usr/lib/python2.4/site-packages/matplotlib/figure.py in savefig(self, *args,
**kwargs)
    515 kwargs[key] = rcParams['savefig.%s'%key]
    516
--> 517 self.canvas.print_figure(*args, **kwargs)
    518
    519

/usr/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py in
print_figure(self, filename, dpi, facecolor, edgecolor, orientation)
     89 else:
     90 agg = self.switch_backends(FigureCanvasAgg)
---> 91 try: agg.print_figure(filename, dpi, facecolor, edgecolor,
orientation)
     92 except IOError, msg:
     93 error_msg_gtk('Failed to save\nError message:
%s'%(msg,), self)

/usr/lib/python2.4/site-packages/matplotlib/backends/backend_agg.py in
print_figure(self, filename, dpi, facecolor, edgecolor, orientation)
    403 from backend_ps import FigureCanvasPS # lazy import
    404 ps = self.switch_backends(FigureCanvasPS)
--> 405 ps.print_figure(filename, dpi, facecolor, edgecolor,
orientation)
    406 else:
    407 raise IOError('Do not know know to handle extension
*%s' % ext)

/usr/lib/python2.4/site-packages/matplotlib/backends/backend_ps.py in
print_figure(self, outfile, dpi, facecolor, edgecolor, orientation)
    909 self._pswriter = StringIO()
    910 renderer = RendererPS(width, height, self._pswriter)
--> 911 self.figure.draw(renderer)
    912
    913 self.figure.set_facecolor(origfacecolor)

/usr/lib/python2.4/site-packages/matplotlib/figure.py in draw(self, renderer)
    393
    394 # render the axes
--> 395 for a in self.axes: a.draw(renderer)
    396
    397 # render the figure text

/usr/lib/python2.4/site-packages/matplotlib/axes.py in draw(self, renderer)
   1358
   1359 for zorder, i, a in dsu:
-> 1360 a.draw(renderer)
   1361
   1362 self.title.draw(renderer)

/usr/lib/python2.4/site-packages/matplotlib/text.py in draw(self, renderer)
    315
    316 return
--> 317 bbox, info = self._get_layout(renderer)
    318
    319 for line, wh, x, y in info:

/usr/lib/python2.4/site-packages/matplotlib/text.py in _get_layout(self,
renderer)
    162 whs = []
    163 for line in lines:
--> 164 w,h = renderer.get_text_width_height(
    165 line, self._fontproperties,
ismath=self.is_math_text())
    166 if not len(line) and not self.is_math_text():

/usr/lib/python2.4/site-packages/matplotlib/backends/backend_ps.py in
get_text_width_height(self, s, prop, ismath)
    191 if ismath: s = s[1:-1]
    192 font = self._get_font_afm(prop)
--> 193 l,b,w,h = font.get_str_bbox(s)
    194
    195 fontsize = prop.get_size_in_points()

/usr/lib/python2.4/site-packages/matplotlib/afm.py in get_str_bbox(self, s)
    315 for c in s:
    316 if c == '\n': continue
--> 317 wx, name, bbox = self._metrics[ord(c)]
    318 l,b,w,h = bbox
    319 if l<left: left = l

KeyError: 295
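
[ Editor's note and sketch: the failing line is self._metrics[ord(c)], so the key in the error is a character code. 295 is 0x127, i.e. U+0127, the h-bar used in the figtext example of Part I; the metrics table simply had no entry for it. The helper below is hypothetical and uses a toy metrics dict, not matplotlib's AFM class; it only shows how to list which characters of a string would trigger such a KeyError. ]

```python
import unicodedata


def missing_codes(text, metrics):
    """Return (code, 'U+XXXX', name) for each character of text absent from metrics."""
    missing = []
    for ch in text:
        if ch == "\n":
            continue  # newlines are skipped, as in get_str_bbox()
        code = ord(ch)
        if code not in metrics:
            missing.append((code, "U+%04X" % code, unicodedata.name(ch, "unknown")))
    return missing


# Toy metrics table that only knows printable ASCII (an assumption for the demo).
ascii_only = {code: None for code in range(32, 127)}
for entry in missing_codes("\u0127\u03c9 k", ascii_only):
    print(entry)
```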

···

On Wednesday 20 April 2005 7:13 pm, Baptiste Carvello wrote:

I take that back, the fonts look nice in Adobe Reader 7, but not so good in
kpdf.

···

On Saturday 07 May 2005 10:46 pm, Darren Dale wrote:

> > Now it's time to produce a PDF. Run ps2pdf on one of the plots above,
> > and look at the ugly Type 3 fonts in your preferred PDF viewer.
> > The only way to get a decent PDF is to set ps.useafm to True again.
>
> I copied my ttf fonts to matplotlib/fonts/ttf/, and reinstalled MPL from
> CVS. At this point, with ps.useafm=False, I was able to generate a pdf with
> really nice looking fonts. How were your fonts ugly?
>
> At this point, I was not able to save a ps figure with ps.useafm=True. I
> get the following error
>
> [...]
>
> KeyError: 295

I was able to work through this problem this morning; it was an issue with
fontforge: inexperienced operator.

> If we try to look at our figure now, ghostscript will complain about
> '/rangecheck yada yada' and fail to display the figure. This is because
> postscript doesn't understand utf-8 encoding.
> Postscript does however understand unicode hexa codes. So we have to replace
> (R(\316\251)) [ octal representation of utf-8 characters ]
> with
> <0052002803a90029> [ each 4 hexa figures are one character ]
> For now, we have to do that manually in our favorite text editor. To compute
> the hexa code in python, we do:
> >>> unistr=u'R(\u03a9)'
> >>> print(('<'+'%04x'*len(unistr)+'>') % tuple([ord(c) for c in unistr]))

I tried replacing the characters in my ps file, but was not able to render the
characters. Would you mind sending me your file, so I can see what you did?

> Producing pretty PDFs is well and nice, but most publishers will ask for
> .eps with all fonts embedded. So we have to embed the fonts into the .eps
> file. I could find no program to do this. DO NOT use gs -sDEVICE=pswrite
> for that. Not only will ghostscript mangle the fonts, but also the plots (!).
>
> Luckily, the FreeSans.ps from above is already a postscript with embedded
> fonts, so we are golden. Just cat the font files together with the .eps
> and merge the headers and footers by hand.
>
> [ It would be nice to have a ps.embedfonts preference. For Type 0, this is
> easy, as above; I don't know for Type 1. Also, it would be good to embed
> only the needed glyphs, but I haven't looked at how to do it ]

I slept on exactly these considerations last night. I was thinking that we
should embed only the needed glyphs, and it should not be optional. The
output has to be portable. Embedding AFM fonts should also take care of the
PostScript kerning issues we have been talking about.

···

On Saturday 07 May 2005 10:46 pm, Darren Dale wrote:

--
Darren S. Dale

Bard Hall
Department of Materials Science and Engineering
Cornell University
Ithaca, NY. 14850

dd55@...143...

I am continuing to make progress reproducing Baptiste's results. I am now at
the final step of his tutorial: I have been able to generate a postscript
file using afm fonts by editing the strings as he suggested. On my system,
~/.fonts is on the gs path, so I put the Free*.ps files there and renamed
them.

One note. The postscript file renders well, but I had trouble converting it to
a pdf with ghostscript-7.07. Some of the glyphs would render right on top of
each other. Here is an example where I tried to render
\hbar\hbar\hbar\hbar\omega\omega\omega\omega\hbar\omega\hbar\omega:
http://people.ccmr.cornell.edu/~dd55/matplotlib/unicode/useafm.ps
http://people.ccmr.cornell.edu/~dd55/matplotlib/unicode/useafm.pdf

I ended up uninstalling the GPL'd ghostscript and installing the AFPL version
8.51. After that, ps2pdf worked as expected.

Darren