Pgfplots (TikZ) backend: implementation strategy

Hi,

I'm looking into replacing my MATLAB(R) plotting routines by something
slicker, and quite naturally found matplotlib. It has all the
capabilities that I would need, except that I can't yet transform my
plots into TikZ.
For MATLAB(R), I used this rather elaborate script
<http://win.ua.ac.be/~nschloe/content/matlab2tikz>.

Well, I thought I can just go ahead and start writing an equivalent
backend; the documentation is really nice and clear (quite unlike
MATLAB's!) so it was no big problem to get into the concepts of the
backend. I played around a little and thought about how I could
implement this and that, and some questions arose which can maybe best
be answered here.

For the sake of clarity, let me just give a snippet of Pgfplots (TikZ)
code that I would like the backend to produce

===================== *snip* =====================
[...]
\begin{semilogyaxis}
[axis on top,
xtick={2,4,6,8,10,12,14,16},
ytick={1e-15,1e-10,1e-05,1,100000},
xmin=0.000000e+00,xmax=1.700000e+01,
ymin=1.000000e-15,ymax=1.000000e+05,
xmajorgrids,
ymajorgrids,
title={$\norm{F(\psi)}_2$},
xlabel={$k$},
width=\figurewidth,
height=\figureheight,
scale only axis
]
% Line plot
\addplot [color=red,only marks,mark=*,mark options={solid,fill=red}]
coordinates{
(1.000000e+00,3.206000e+01) (2.000000e+00,3.860000e+01)
(3.000000e+00,1.421000e+03) (4.000000e+00,4.143000e+02)
(5.000000e+00,1.445000e+02) (6.000000e+00,3.775000e+01)
(7.000000e+00,7.455000e+00) (8.000000e+00,7.228000e-01)
(9.000000e+00,2.275000e-02) (1.000000e+01,4.953000e-05)
(1.100000e+01,9.718000e-10) (1.200000e+01,5.534000e-07)
(1.300000e+01,4.217000e-11) (1.400000e+01,3.930000e-03)
(1.500000e+01,2.067000e-07) (1.600000e+01,7.231000e-12)
};
\end{semilogyaxis}
[...]
===================== *snap* =====================

This yields a purely marker plot (without lines) on a coordinate
system where the y-coordinate is log-scaled. You see that the code is
rather semantic and can easily be edited (which I think is the whole
point of Pgfplots as opposed to pure TikZ).

Now, if for a matplotlib plot I had query functions for the axes
ranges, ticks, grids, titles, data values, and so on and so forth, it
would be a more or less complicated parsing of those and creating what
we see above.

However, it seems to me that the concept of backends is different.
draw_path() would be called to plot the graph itself, the coordinate
axes, and basically everything that resembles a line, giving its
outline (color, shape, markers, start and end points) but *no*
semantic information, that is, whether the path is part of an axis, an
arrow or whatever.
Is that correct?

Considering this, what do matplotlib masterbrains :-) think would be a
good way to extract Pgfplots code out of a matplotlib figure? Is a
backend feasible at all? Would a function as "matplotlib2tikz(
myFigure )" be more advisable, making use of all sorts of query
functions? (Such as myFigure.axes.get_xlim() -- Does something like
that exist at all?)

Cheers,
Nico

Nico Schlömer wrote:

However, it seems to me that the concept of backends is different.
draw_path() would be called to plot the graph itself, the coordinate
axes, and basically everything that resembles a line, giving its
outline (color, shape, markers, start and end points) but *no*
semantic information, that is, whether the path is part of an axis, an
arrow or whatever.
Is that correct?
  

Yes -- the backend interface is designed for "dumber" output formats that don't know anything about the semantics of plots -- PDF, PS, SVG etc. are basically "draw a line here, put some text there" sorts of things. Of course, you could probably write a TikZ backend without much semantic detail fairly easily.

Considering this, what do matplotlib masterbrains :-) think would be a
good way to extract Pgfplots code out of a matplotlib figure? Is a
backend feasible at all? Would a function as "matplotlib2tikz(
myFigure )" be more advisable, making use of all sorts of query
functions? (Such as myFigure.axes.get_xlim() -- Does something like
that exist at all?)
  

That sounds more feasible -- however, keep in mind that anything without a public interface (a get_* method) is free to change in a future version of matplotlib. I suspect (though haven't thought it all the way through) that you may be required to dig into private members to get everything you need. I also suspect that this will be a lot of work to support all of the kinds of plots that matplotlib supports, and problems are likely to arise when the semantics of Pgfplots and/or TikZ don't mesh well. For example, does Pgfplots support the same set of nonlinear transformations as matplotlib?
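As a rough illustration of the query-function approach discussed here, the following sketch reads axis and line properties back from a figure using only public getters. The dictionary names `axis_info` and `line_info` are made up for this example, and the Agg backend is chosen only so that the snippet runs without a display; it is not a statement about how any existing translator works.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt

# Build a small marker-only plot with a log-scaled y-axis.
fig, ax = plt.subplots()
ax.semilogy([1, 2, 3], [32.06, 38.6, 1421.0], "ro")
ax.set_xlabel("$k$")
ax.set_title("residual")

# Axis-level information, all through public get_* methods.
axis_info = {
    "xlim": ax.get_xlim(),
    "ylim": ax.get_ylim(),
    "xscale": ax.get_xscale(),
    "yscale": ax.get_yscale(),
    "xlabel": ax.get_xlabel(),
    "title": ax.get_title(),
    "xticks": list(ax.get_xticks()),
}

# Per-line information: style attributes and the raw data values.
line = ax.get_lines()[0]
line_info = {
    "color": line.get_color(),
    "marker": line.get_marker(),
    "linestyle": line.get_linestyle(),
    "xy": line.get_xydata().tolist(),  # list of [x, y] pairs
}

print(axis_info)
print(line_info)
```

Everything a simple line plot needs appears to be reachable this way; whether the same holds for legends, annotations, or less common plot types is exactly the open question raised above.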

Other people have suggested TikZ support in the past, and I, personally, haven't been very convinced of the usefulness of such a thing. It's just as easy to include a PDF or EPS in LaTeX, and the mathtext and/or usetex functionality goes a long way to making the plots blend in nicely -- though it does take some care to control the size of the plots to make the font sizes match, it's by no means impossible. Jae-Joon's work with fancy arrow styles brings in a lot of the finesse features of TikZ to matplotlib. What do you see as the use cases for a TikZ backend?

Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

That sounds more feasible -- however, keep in mind that anything without a
public interface (a get_* method) is free to change in a future version of
matplotlib. I suspect (though haven't thought it all the way through) that
you may be required to dig into private members to get everything you need.

That may be possible, particularly I suppose for the actual plot data
(x, y(, z) values). I'm going to have to look into what
matplotlib.get* can offer me. Is there a list with all the methods
available?
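One way to get such a list without consulting the documentation is plain Python introspection. The sketch below collects every callable attribute whose name starts with `get_`; the helper name `list_getters` is an assumption introduced for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt


def list_getters(obj):
    """Return the sorted names of all callable get_* attributes of obj."""
    return sorted(
        name
        for name in dir(obj)
        if name.startswith("get_") and callable(getattr(obj, name))
    )


fig, ax = plt.subplots()
(line,) = ax.plot([0, 1], [0, 1])

axes_getters = list_getters(ax)
line_getters = list_getters(line)

print(len(axes_getters), "getters on the axes object")
print(len(line_getters), "getters on the line object")
```

matplotlib also ships `matplotlib.artist.getp(obj)`, which prints the properties of an artist together with their current values and may be more convenient for interactive exploration.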

I also suspect that this will be a lot of work to support all of the kinds
of plots that matplotlib supports, and problems are likely to arise when the
semantics of Pgfplots and/or TikZ don't mesh well. For example, does Pgfplots
support the same set of nonlinear transformations as matplotlib?

matplotlib and Pgfplots certainly have a different feature set, but
for a couple of common simple plots (e.g., 2D x, y data, log-scaled
axes, lines, markers, and so forth) it should be possible to write a
very basic extendible translator.
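To make the idea of a very basic translator concrete, here is a minimal sketch that turns one 2D data series into Pgfplots source of the shape shown at the top of the thread. The function `to_pgfplots` and its parameters are hypothetical and only cover a log-scaled y-axis, a label, a title, and marker-only plotting; it is not the script discussed later in the thread.

```python
def to_pgfplots(x, y, *, ylog=False, xlabel=None, title=None,
                color="red", only_marks=False):
    """Return Pgfplots source for one 2D data series (illustrative helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Pgfplots uses a dedicated environment for a log-scaled y-axis.
    env = "semilogyaxis" if ylog else "axis"

    axis_opts = []
    if title is not None:
        axis_opts.append("title={%s}" % title)
    if xlabel is not None:
        axis_opts.append("xlabel={%s}" % xlabel)
    # Leave the size to LaTeX lengths so the figure can be rescaled later.
    axis_opts += ["width=\\figurewidth", "height=\\figureheight",
                  "scale only axis"]

    plot_opts = ["color=%s" % color]
    if only_marks:
        plot_opts += ["only marks", "mark=*"]

    # Same number format as in the hand-written example: 1.000000e+00
    coords = " ".join("(%e,%e)" % (xi, yi) for xi, yi in zip(x, y))

    return "\n".join([
        "\\begin{%s}" % env,
        "[" + ",\n".join(axis_opts),
        "]",
        "\\addplot [%s]" % ",".join(plot_opts),
        "coordinates{",
        coords,
        "};",
        "\\end{%s}" % env,
    ])


tex = to_pgfplots([1, 2, 3], [32.06, 38.6, 1421.0],
                  ylog=True, xlabel="$k$", only_marks=True)
print(tex)
```

Keeping the options in plain lists makes it easy to extend the function with ticks, grids, or limits later, which is roughly what "extendible" would mean in practice.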

Other people have suggested TikZ support in the past, and I, personally,
haven't been very convinced of the usefulness of such a thing.

Well, yeah, I guess it is possible to export PDF files in such a way
that it looks sort of nice if you watch out what you do with the font
sizes, and font family in general. Over the years I've become somewhat
tired of that whole tinkering with the fonts, and eventually I'd be
about 80% happy with the result I got. If you don't want to do that
every time, and you want 100%, TikZ/Pgfplots can come in quite handy;
especially if a simple module can be written rather quickly. If you
have a simple 2D plot and you would like to get it cleanly into your
LaTeX file, why not use TikZ if it's available?

Cheers,
Nico

Matplotlib, by design, needs to know the exact dimension (height,
width and descent) of texts that the backend will produce (before the
output is produced), and I wonder if that's going to be possible with
TikZ.
Unless you can solve this problem, I don't think a TikZ backend will be feasible.

Some form of simple translator like "matplotlib2tikz(myFigure)" should
certainly be possible. But things like legends and annotations will be
very tricky, if not impossible.

In my opinion, there is no reason that matplotlib cannot make a user 100%
happy, because matplotlib itself actually runs tex/latex in usetex
mode. If TikZ can do it, I believe matplotlib can too. If you're
not 100% happy with the output, please report why. If you're
using matplotlib in usetex mode and the fonts somehow do not
match the other text in your document, it only means that there is
room for improvement in matplotlib. At least to me.

Regards,

-JJ

Hi,

I've started something for a translator, see

http://win.ua.ac.be/~nschloe/other/websvn/filedetails.php?repname=matplotlib2tikz&path=%2Ftrunk%2Fmatplotlib2tikz.py&templatechoice=Elegant

It's my first time with Python, so there'll probably be a lot of things
that you don't do (tm); hints very much appreciated!

In my opinion, there is no reason that matplotlib cannot make a user 100%
happy, because matplotlib itself actually runs tex/latex in usetex
mode. If TikZ can do it, I believe matplotlib can too.

I attached a PDF with two figures, one PDF generated and one TikZ
(generated using my little script). There's clearly a difference in
how seamlessly the two figures embed into the document, but I might not
be using the PDF backend appropriately. Any hints on how I could get
the fonts in place to allow for a fairer comparison?

Cheers,
Nico

pdf-vs-tikz.pdf (44.2 KB)

Nico Schlömer wrote:

I attached a PDF with two figures, one PDF generated and one TikZ
(generated using my little script). There's clearly a difference in
how seamlessly the two figures embed into the document

The only real difference I see is font size. I'd play with that, and also play with how you scale the image when you embed it in TeX.

I know when I was putting a bunch of Matlab figures into a big LaTeX doc (years ago), I got the best results by drawing the figures one size, then scaling them to fit the LaTeX doc (scaled to \textwidth, or something like that).

Also psfrag is great, though I haven't found a pdffrag -- is there one?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

The only real difference I see is font size.

And the font family, and the color.

The advantage that I see with TikZ is that the font is *exactly* the
font used in the surrounding text, no matter the scaling of the axes.
A major reason for me to use TikZ was actually that I can rescale my
figures later, without having to regenerate the plots anew, tinkering
with font sizes to make them ~sort of~ match. (Also, one can use LaTeX
colors in the plots, e.g. (as in my case), corporate colors as defined
in the presentation template.)

Cheers,
Nico

Matplotlib determines the font properties when the output file is created.
TikZ (at least the way your translator works) lets that
information be determined when LaTeX is run.
No doubt the TikZ result is much better than that of the matplotlib backends,
but that is because TikZ and matplotlib work differently. While the TikZ
results look better, this approach will greatly limit the functionality of
matplotlib. And I'm afraid that your translator will not do much more
than converting very simple plots.

Anyhow, good luck with your project!
Regards,

-JJ

On Tue, Jan 12, 2010 at 5:14 PM, Nico Schlömer <nico.schloemer@...761.....> wrote:

I attached a PDF with two figures, one PDF generated and one TikZ
(generated using my little script). There's clearly a difference in
how seamlessly the two figures embed into the document, but I might not
be using the PDF backend appropriately. Any hints on how I could get
the fonts in place to allow for a fairer comparison?

matplotlib. And I'm afraid that your translator will not do much more
than converting very simple plots.

Although Pgfplots can also do ~somewhat~ fancy stuff, I think you're
definitely right on that one. When I look at the matplotlib gallery,
I'm getting dizzy.

For everyday purposes, though, I'm rather interested in somewhat
simple plots: 2D graphs, maybe an image plot, something like that.
I'll see how far I get with that..

Thanks very much for your guys' advice!

Cheers,
Nico

Hey, and is there any sort of matplotlib market place where I could
put the file for general bashing/downloading once it can do more than
a sin-plot?

--Nico

Nico Schlömer wrote:

The advantage that I see with TikZ is that the font is *exactly* the
font used in the surrounding text, no matter the scaling of the axes.

I used to do that with psfrag -- it is a really nice tool. I miss it with pdftex. It does add an extra step, but it also supports any PS graphics. I wanted it the other day for a diagram I made with Inkscape (I couldn't get the TeX plugin working...)

Maybe you could port psfrag to pdf instead (my selfish desire...)

-Chris

Maybe you could port psfrag to pdf instead (my selfish desire...)

Ever tried
http://tug.ctan.org/tex-archive/support/pdfrack/
?
--Nico

Nico Schlömer wrote:

Hey, and is there any sort of matplotlib market place where I could
put the file for general bashing/downloading once it can do more than
a sin-plot?
  

Well, github is my suggestion. If it's a patchset of the MPL source, then fork the MPL repository at http://github.com/astraw/matplotlib . If it's a standalone thing, just create a new project. Github makes this kind of sharing easy at all levels from casual one-off events to close collaboration.

-Andrew

Andrew Straw wrote:

Nico Schlömer wrote:
  

Hey, and is there any sort of matplotlib market place where I could
put the file for general bashing/downloading once it can do more than
a sin-plot?
  

Well, github is my suggestion. If it's a patchset of the MPL source, then fork the MPL repository at http://github.com/astraw/matplotlib . If it's a standalone thing, just create a new project. Github makes this kind of sharing easy at all levels from casual one-off events to close collaboration.
  

We can also link to it from the matplotlib website once you have a location established, for example from here:

   http://matplotlib.sourceforge.net/users/toolkits.html

Mike

Well, there is something basic now at

    http://github.com/nicki/matplotlib2tikz

It can handle line plots, images, and color bars; adding more stuff
should not be hard, and in case a few people are interested and
willing to contribute, the script will progress rather quickly I
reckon.

I myself am just picking up Python, so a quick look from one or
another pro would certainly be appreciated.

Cheers,
Nico

On Wed, Jan 13, 2010 at 2:24 PM, Michael Droettboom <mdroe@...31...> wrote:

Andrew Straw wrote:

Nico Schlömer wrote:

Hey, and is there any sort of matplotlib market place where I could
put the file for general bashing/downloading once it can do more than
a sin-plot?

Well, github is my suggestion. If it's a patchset of the MPL source, then
fork the MPL repository at http://github.com/astraw/matplotlib . If it's a
standalone thing, just create a new project. Github makes this kind of
sharing easy at all levels from casual one-off events to close
collaboration.

We can also link to it from the matplotlib website once you have a location
established, for example from here:

http://matplotlib.sourceforge.net/users/toolkits.html

Mike
