Yeah, I plot with pcolor a lot but haven’t recently, so next time I do I’ll check. As you said, it would make a lot of sense for stamping to save overhead there.
The overhead doesn’t seem to be too big for small plots; I was just curious where it was most useful.
On Thu, Aug 1, 2013 at 12:59 AM, Michael Droettboom <mdroe@…86…> wrote:
On 07/31/2013 10:38 AM, Jeffrey Spencer wrote:
Michael,
Pdftocairo is a good tool to know so thanks for that tip.
I still think the current ‘stamp’ method is a regression
when it is used in all cases. I understand that in a
complicated figure with a bunch of subplots it would be
beneficial and create smaller code. I don’t see how it would
often result in reduced file sizes for single figures.
The case where it has an enormous impact is when the same shape is
used multiple times. For example in a scatter, hexbin or pcolor
plot.
I usually output single figures with one plot, and I don't
think any of the ones I am currently working on came out
smaller in 1.4.x. They all resulted in reduced file sizes
with mpl 1.1.1. This figure of 3d spheres resulted in 60kb
instead of roughly 80kb after running pdftocairo. Anyway, you
said that in coming versions a threshold should be set before
stamping of objects occurs, so a fix is on the way eventually.
Yes, but it's too complex of a fix to throw in quickly. I think the
overall benefit of stamping is preferable to not doing it at all at
this point.
Mike
Thanks for all the help,
Jeff
On Wed, Jul 31, 2013 at 11:31 PM, Michael Droettboom <mdroe@…86…> wrote:
On 07/30/2013 04:20 PM, Jeffrey Spencer wrote:
Michael,
Thanks, that is very informative. It answers most
of the problems I was having, and I read MEP14, which
looks really useful.
That being said, does the ps backend subset the
fonts or use collections for drawing (is the
collections feature global or just in the pdf
backend)?
The ps backend has the same behavior as pdf on both
counts. TTF fonts are subsetted, but the fonts that come
from TeX come to us as Type1 fonts, which matplotlib
currently does not know how to subset. It also handles
collections in the same way (by creating a “stamp” and
reusing it).
I usually use .eps output and convert to pdf
using epstopdf, unless the figure has an alpha
channel, because it always results in a much smaller
file (roughly 60kB for this file, or around 10kB for
a plain figure) than direct pdf output, with the
output looking the same. I pretty much always have
usetex=True, so maybe the pdf file is always
embedding the full fonts.
Yes, when usetex=True, matplotlib does not do any font
subsetting (in any backend). To get around this
limitation, one can use the pdftocairo tool (part of
poppler utils) to convert from pdf to a pdf with
subsetted fonts. With your example, I was able to get the
pdf down to ~80k. With MEP14, we would basically move
such functionality into matplotlib itself, but that’s sort
of a long term, semi-back-burner project so it could be a
while.
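As a rough sketch of that workaround (not something that ships with
matplotlib): the snippet below shells out to pdftocairo with its -pdf
output flag and reports the file size before and after. The function
names are made up for illustration, and it assumes pdftocairo is
installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def pdftocairo_command(src, dst):
    """Build the argument list for a PDF-to-PDF pass through pdftocairo."""
    return ["pdftocairo", "-pdf", str(src), str(dst)]


def resave_with_pdftocairo(src, dst):
    """Rewrite src as dst via pdftocairo.

    Returns (size_before, size_after) in bytes, or None when the
    pdftocairo executable cannot be found (hypothetical helper).
    """
    if shutil.which("pdftocairo") is None:
        return None
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(pdftocairo_command(src, dst), check=True)
    return Path(src).stat().st_size, Path(dst).stat().st_size
```

Calling resave_with_pdftocairo("figure.pdf", "figure_subset.pdf") would
write the converted copy next to the original; how much smaller it ends
up depends on the fonts involved.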
It's possible that epstopdf is doing some font subsetting
of its own. But as you point out, Postscript (as a
specification) doesn’t support alpha, so it’s not useful
when you need alpha.
Also, does the Cairo backend support
usetex=True or subsetting? I know I had read it
did not support usetex, but that was maybe 2 years
ago or so. The x, y, z axes look correct with cairo
but the IPA fonts don’t render properly. The
legend font says it is size 12, but if you zoom in
extremely close you can see they are the correct
fonts, just way too small. The file size is around
60kB as well, so I am guessing it supports
subsetting of fonts.
Cairo does support font subsetting, but the matplotlib
Cairo backend has no support for usetex. I’m surprised
this worked for you at all. When I run your example with
the Cairo backend, the IPA characters appear as raw TeX
source code, i.e. “\textipa{i}”, which is what I would
expect given that the regular font renderer doesn’t
understand that syntax.
The pgf backend would also subset fonts when
outputting to .pdf, I’m assuming, because that is the
default with pdftex? It results in similar size
files to the .eps output for this file (roughly
60kB also).
Yes.
The IPA font comes from the tipa package
(\usepackage{tipa}), and that is why I
think these look different. That package draws
these characters with its own font libraries instead of
whatever is selected as the text font. Maybe I’m
wrong about this, but that is my understanding,
because even in normal latex code the fonts look
different than the standard text.
That is correct. The default font for usetex=True is
Computer Modern, whereas it is Bitstream Vera Sans in the
default font rendering. I was referring to the difference
between 1.2 and 1.4 which was using TeX fonts in both
cases, but due to a bug in 1.3/1.4 was rendering the IPA
in serif when you had requested sans-serif.
Mike
Cheers,
Jeff
On Wed, Jul 31, 2013 at 4:43 AM, Michael Droettboom <mdroe@…86…> wrote:
There are two different things going on
here.
Between 1.2.1 and now, there was a bugfix
to the font selection routine that
inadvertently introduced a bug selecting
fonts in the usetex backend. You may
notice that on master, the IPA font
selected is different. The file size
difference can be attributed to the
slightly larger font size of the one it
selected vs. the one it should have. Note
that when usetex is True, the fonts are
not subsetted, so you always get the full
font embedded in the file (MEP14 work will
fix this in the future).
See b5c340 for the commit that introduced the
bug, and https://github.com/matplotlib/matplotlib/pull/2260
for the fix (which should make it into
1.3.0 final).
Between 1.1.1 and 1.2.1 a change was made
in how collections are handled.
Previously, each path was redrawn
individually. In 1.2, if a path is reused
multiple times, a “stamp” is created and
then it is “used” multiple times. In
principle, this generally reduces file
sizes by a large amount. However, in the
case of this figure with the 3D spheres,
each path is used only once, so rather
than getting the file size savings of that
approach, we only get the overhead. The
backend could be smarter by not doing this
when the path is only used a small number
of times. Such a fix would be welcome,
but is probably too large/risky to try to
get into the current release cycle. It
will have to wait for 1.3.1.
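To make the trade-off concrete, here is a toy cost model of stamping
versus drawing each path inline. The byte counts (60 bytes to define a
stamp, 25 bytes per use) are invented for illustration and are not
measured from the pdf or ps backend; only the shape of the comparison
matters.

```python
def inline_bytes(path_bytes, uses):
    """Size when the full path is written out at every use."""
    return path_bytes * uses


def stamped_bytes(path_bytes, uses, define_overhead=60, use_bytes=25):
    """Size when the path is defined once, then referenced at each use.

    define_overhead and use_bytes are hypothetical figures.
    """
    return path_bytes + define_overhead + use_bytes * uses


def break_even_uses(path_bytes, define_overhead=60, use_bytes=25):
    """Smallest use count at which stamping is no larger than inlining.

    Returns None when a reference costs at least as much as the path
    itself, so stamping can never pay off.
    """
    if path_bytes <= use_bytes:
        return None
    # Solve path_bytes * n >= path_bytes + define_overhead + use_bytes * n.
    numerator = path_bytes + define_overhead
    denominator = path_bytes - use_bytes
    return -(-numerator // denominator)  # ceiling division


# A path used once (like each sphere patch): only the overhead is paid.
print(inline_bytes(200, 1), stamped_bytes(200, 1))
# A marker reused 10000 times (like a scatter plot): stamping wins.
print(inline_bytes(200, 10000), stamped_bytes(200, 10000))
```

Under these made-up numbers a 200-byte path breaks even at two uses,
which is the kind of threshold the backend could check before deciding
to create a stamp.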
Cheers,
Mike
On 07/30/2013 12:24 PM, Jeffrey Spencer wrote:
K, I have just made the
script self-contained but it loads
external data so I have attached
that as well. If you want me to just
separate out the plotting commands
let me know. I have also attached my
matplotlib rc file which is the same
on all three systems. All the
modifications to the matplotlibrc
file are copied to the top and in
the first 30 lines or so.
Of note, the smallest file
sizes for pdf come from the pgf
backend, around 60kb. Not sure if
that helps at all. It is also
around the same size, about 60kb,
if I export to .eps and then
convert to pdf. The problem with eps
in these 3d figures, though, is that
the back wall, I think, has an alpha
channel, because it just becomes a
solid wall in the output, with no
lines through it like the other two
walls.
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
[https://lists.sourceforge.net/lists/listinfo/matplotlib-users](https://lists.sourceforge.net/lists/listinfo/matplotlib-users)
On Tue, Jul 30, 2013 at 11:23 PM, Jouni K. Seppänen <jks@…397…> wrote:
Jeffrey Spencer <jeffspencerd@…287…> writes:
> I have three different versions of matplotlib that all output different
> file sizes with matplotlib 1.1.1 providing the smallest. This is for the
> same exact script. I can post the script if that helps.
>
> MPL 1.4.x: 539.32kb, Ubuntu 12.10
> MPL 1.1.1: 172.56kb, Ubuntu 12.10
> MPL 1.2.1: 475.9kb, Ubuntu 13.04
Yes, it would be interesting to know what the plotting commands are.
Just as a guess, since all the sizes are a few hundred kilobytes, it
could be a difference in e.g. font embedding - many TrueType fonts are
of comparable size.
--
Jouni K. Seppänen
[http://www.iki.fi/jks](http://www.iki.fi/jks)