Apparent bug with EPS files using Liberation fonts

Greetings. The attached script saves three EPS files (also attached) of a figure containing a text object using the Liberation Sans font. In the three EPS files, the text object contains no spaces, a normal space, and a non-breaking space, respectively. Ghostscript handles the first and third files but not the second, for which it reports “Error: /undefined in --get--” at the PostScript “show” instruction for the text. I noticed that the second file embeds a glyph named “uni00A0” (the non-breaking space) but none for the normal space, so I suppose that’s why the error occurs. I tried a few other fonts and did not see any problems. I have matplotlib v. 1.0.0, Python 2.6.5, and Liberation fonts 1.06.0.20100721. Would someone kindly look into whether this might be a bug in matplotlib or in the Liberation fonts? Many thanks!

mpl-eps-liberation.py (354 Bytes)

mpl-eps-liberation-nospace.eps (6.38 KB)

mpl-eps-liberation-space.eps (6.41 KB)

mpl-eps-liberation-nbspace.eps (6.73 KB)

mpl-eps-liberation-space.out (1.03 KB)
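The observation in the post above (a “uni00A0” glyph is embedded, but no “space”) can be checked mechanically by scanning an EPS file for PostScript literal names. The following is only a rough sketch: the helper name glyph_names_present and the sample fragment are made up for illustration, and the fragment is not actual matplotlib output.

```python
import re

def glyph_names_present(eps_text, names=("space", "uni00A0")):
    """Report which PostScript literal names (e.g. /space) occur in eps_text."""
    found = {}
    for name in names:
        # A literal name is "/" followed by the name; the lookahead rejects
        # longer names that merely start with it (e.g. /spaces).
        pattern = r"/" + re.escape(name) + r"(?![A-Za-z0-9_.])"
        found[name] = re.search(pattern, eps_text) is not None
    return found

# Hypothetical stand-in for the font section of an EPS file (an assumption,
# not copied from any of the attached files).
sample = (
    "/CharStrings 3 dict dup begin\n"
    "/.notdef{...}def\n"
    "/uni00A0{...}def\n"
    "/a{...}def\n"
    "end"
)

print(glyph_names_present(sample))
```

To inspect a real file, read its contents as text and pass them in place of sample. A name can also appear outside the font definition, so a positive result is a hint rather than proof.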

I can confirm that evince also has a problem with the second image, but not the first or the third images. This is using the latest matplotlib from svn.

The error from evince is very non-descriptive:

undefined -21

** (evince:3597): WARNING **: Error rendering thumbnail
undefined -21

Ben Root

On Mon, Aug 30, 2010 at 10:48 AM, Stan West <stan.west@…883…> wrote:

Greetings. The attached script saves three EPS files (also attached) of a figure containing a text object using the Liberation Sans font. [...]

From: ben.v.root@…149… [mailto:ben.v.root@…716…] On Behalf Of Benjamin Root
Sent: Tuesday, August 31, 2010 23:20

I can confirm that evince also has a problem with the second image, but not the first or the third images. This is using the latest matplotlib from svn.

Thank you for the confirmation, Ben.

Here’s what I’ve found so far. I examined the Liberation sources (the SFD files) in FontForge and as text, and I gather that some of them use a non-standard encoding. Liberation Sans, for example, does not define a space glyph with the name “space”; instead it defines a glyph for the non-breaking space at code point U+00A0 with the name “uni00A0” and gives U+0020 (the plain space) as an alternate encoding. (In the file LiberationSans-Regular.sfd, these definitions start at line 2929.) However, matplotlib assumes that the font uses PostScript’s StandardEncoding. I suppose that when PostScript processes “(some text) show”, it looks up the space character under the standard glyph name “space”, and the embedded font has no glyph by that name. Here is an excerpt of matplotlib/ttconv/pprdrv_tt.cpp from SVN starting at line 415:

/*-------------------------------------------------------------
** Define the encoding array for this font.
** Since we don't really want to deal with converting all of
** the possible font encodings in the wild to a standard PS
** one, we just explicitly create one for each font.
-------------------------------------------------------------*/
void ttfont_encoding(TTStreamWriter& stream, struct TTFONT *font, std::vector<int>& glyph_ids, font_type_enum target_type)
    {
        stream.putline("/Encoding StandardEncoding def");
        // if (target_type == PS_TYPE_3) {
        //     stream.printf("/Encoding [ ");
        //     for (std::vector<int>::const_iterator i = glyph_ids.begin();
        //          i != glyph_ids.end(); ++i) {
        //         const char* name = ttfont_CharStrings_getname(font, *i);
        //         stream.printf("/%s ", name);
        //     }
        //     stream.printf("] def\n");
        // } else {
        //     stream.putline("/Encoding StandardEncoding def");
        // }
    } /* end of ttfont_encoding() */

I saw in the SVN logs that the commented-out code for non-standard encodings had a brief life of about a month earlier this year before being declared more trouble than it was worth.
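To make the suspected failure concrete, here is a much simplified Python model of the lookup that “show” triggers: each character code goes through the Encoding to a glyph name, and the name must then be found among the font’s glyphs. It is a sketch under assumptions, not how Ghostscript is implemented. The glyph sets are invented stand-ins for the two fonts, and only three StandardEncoding entries are included.

```python
# Three entries of PostScript's StandardEncoding (character code -> glyph name).
STANDARD_ENCODING = {32: "space", 97: "a", 98: "b"}

class UndefinedGlyph(KeyError):
    """Stands in for the interpreter's /undefined error."""

def show(text, encoding, char_strings):
    """Return the glyph names a simplified 'show' would render for text."""
    names = []
    for ch in text:
        name = encoding.get(ord(ch), ".notdef")
        if name not in char_strings:
            # The font has no glyph procedure under this name.
            raise UndefinedGlyph(name)
        names.append(name)
    return names

# Hypothetical glyph sets: one names its space glyph "uni00A0", the other "space".
liberation_sans = {".notdef", "a", "b", "uni00A0"}
liberation_sans_narrow = {".notdef", "a", "b", "space"}

print(show("a b", STANDARD_ENCODING, liberation_sans_narrow))
print(show("ab", STANDARD_ENCODING, liberation_sans))
try:
    show("a b", STANDARD_ENCODING, liberation_sans)
except UndefinedGlyph as err:
    print("undefined glyph:", err.args[0])
```

In this model, text without spaces works with either glyph set, while text containing a plain space fails only for the font whose space glyph is named “uni00A0”. That matches the behaviour of the first two EPS files.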

Getting back to the fonts, I found that not all of the Liberation fonts use this non-standard encoding. The Liberation Sans Narrow fonts in the current release define “space” at U+0020 with U+00A0 as an alternate encoding, and they work fine in matplotlib EPS files. I also checked a few of the fonts in one older release, 1.0, and they also work correctly.

One work-around I found is to use Unicode strings for text containing spaces, which in the EPS file causes spaces to be looked up under the glyph name “uni00A0”. If embedding Type 3 fonts, another work-around (which I only spot-checked) is to effectively standardize the encoding by editing the EPS file, changing “/uni00A0” to “/space” in the font definition and in glyphshow operations that call for “/uni00A0”.
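The second work-around could be scripted roughly as follows. This is a sketch only: the function name rename_glyph and the sample text are illustrative, and it assumes the EPS file can be treated as plain text and that the font does not already define a “/space” glyph.

```python
import re

def rename_glyph(eps_text, old="uni00A0", new="space"):
    """Replace the literal name /old with /new throughout the EPS text."""
    # The lookahead keeps longer names such as /uni00A01 untouched.
    pattern = r"/" + re.escape(old) + r"(?![A-Za-z0-9_.])"
    return re.sub(pattern, lambda match: "/" + new, eps_text)

# Hypothetical fragment: a glyph definition, a glyphshow call, and a longer name.
sample = "/uni00A0{...}def\n/uni00A0 glyphshow\n/uni00A01 untouched"
print(rename_glyph(sample))
```

To apply it, read the EPS file as text, pass the contents through rename_glyph, and write the result to a new file so that the original is preserved. As noted above, this has only been spot-checked with embedded Type 3 fonts.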

Stan,

Thanks for the insightful analysis. Could you file a bug report with some of this information (at the very least, reference your message on the mailing list)?

Ben Root

On Wed, Sep 1, 2010 at 11:29 AM, Stan West <stan.west@…595…> wrote:

Here’s what I’ve found so far. [...]

From: ben.v.root@…149… [mailto:ben.v.root@…716…] On Behalf Of Benjamin Root
Sent: Thursday, September 02, 2010 14:39

Thanks for the insightful analysis. Could you file a bug report with some of this information (at the very least, reference your message on the mailing list)?

It took me a while to get back to it, but I’m happy to file a report. The tracker ID is 3062773.