mpl1 draft

_John_Hunter · July 19, 2007, 5:18pm

I've been working on a laboratory in which we can fruitfully discuss,
test, and implement mpl1 design issues. I am a big fan of the
python-as-modeling-language approach to design. I have tried to solve
from the ground up some of the design flaws in matplotlib -- the
transformation architecture and the data model, in which transformed
data is pushed to the backend with every draw. The goal was to get a
single file of pure python so people can get their heads around the
code in one place, and experiment w/o having to go through a
compile/install cycle. You will need the latest svn matplotlib and
the latest svn enthought traits 2 -- see the header of mpl1/mtraits.py
for install instructions for the latter.

The sketch is in mpl1/mpl1.py in matplotlib svn, and it does produce a
graph (see attached). Right now only path drawing is implemented. It
is now time to think about how to handle the Axis. We want to figure
out the right way to bundle an xaxis and a yaxis with an artist so
that we can support multiple y-axes, etc., on one Axes. Drawing axis
ticks also brings up another problem I have not figured out -- how to
draw markers in points at data locations in the figure. matplotlib
uses some trickery in the transforms (transoffset and friends)
designed to handle this. An alternative that I am considering is
making a first class primitive called Markers, which have a list of
x,y locations, a marker path, an affine and some path properties. The
renderer can then cache the path and then draw markers in points in
the right place. I am open to other ideas, but this is my current
thinking.
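
A minimal sketch of what such a Markers primitive might look like. Only
the name Markers and its four ingredients (locations, marker path, affine,
path properties) come from the description above; the field names, the
display_outlines helper, and the assumption that one display unit equals
one point are hypothetical and not taken from mpl1.py.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Markers:
    """Hypothetical primitive: one marker path stamped at many locations."""
    locations: np.ndarray          # (N, 2) x,y positions in data coordinates
    marker_path: np.ndarray        # (M, 2) marker outline, in points
    affine: np.ndarray = field(default_factory=lambda: np.eye(3))  # data -> display
    edgecolor: str = "black"       # illustrative path properties
    facecolor: str = "none"
    linewidth: float = 1.0

    def display_outlines(self):
        """Return an (N, M, 2) array: the marker outline at each location.

        Only the locations go through the affine; the marker outline is
        added afterwards as a plain offset, so its size stays fixed in
        points (assuming one display unit equals one point).
        """
        ones = np.ones((len(self.locations), 1))
        homogeneous = np.hstack([self.locations, ones])     # (N, 3)
        centers = (homogeneous @ self.affine.T)[:, :2]      # (N, 2)
        return centers[:, None, :] + self.marker_path[None, :, :]
```

Because the marker outline is stored once and never transformed, a
renderer could cache it and only recompute the N centers when the affine
changes.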

Most of the effort here has been trying to get the transformations
right, so please give me feedback and/or make corrections and
suggestions -- I'm not wild about the naming either, so feel free to
come up with something better. There is also the question of whether
we want to pay up and use 4x4 from the ground up and just ignore the
3rd dimension to open the door for 3D support. My inclination is
probably not, but I am open to ideas.

Included below is the "DESIGN_GOALS" document, also in mpl1 svn::

Here are some of the things I would like to accomplish with mpl1. Any
and all of this is open to discussion. What I present below is pretty
ambitious, so if there is support, we will need significant
contributions from several developers for several months. Ideally, we
would get a good sketch working, and then organize a sprint (3-4 days?)
for late August, where we try to get as far as possible toward making
this viable.

= Data copying =

Push the data to the backend only once, or only when required. Update
the transforms in the backend, but do not push transformed data on
every draw. This is potentially a major win, because we currently
move the data around on every draw. Eg, see how mpl1.py handles pushing
the paths when the renderer is set (Figure.set_renderer) but on draw
commands (Figure.draw) only pushes the current affine.
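
A rough sketch of the push-once idea in plain Python. The method names
Figure.set_renderer and Figure.draw are the ones mentioned above;
everything else (CachingRenderer, add_path, set_affine, the path ids and
the 6-tuple affine) is a hypothetical stand-in, not the mpl1.py code.

```python
class CachingRenderer:
    """Hypothetical backend: stores each path once, keyed by an id."""

    def __init__(self):
        self.paths = {}            # path id -> vertex data (pushed once)
        self.affines = {}          # path id -> current 6-tuple affine
        self.vertices_pushed = 0   # counts how much data crossed over

    def add_path(self, pathid, vertices):
        self.paths[pathid] = vertices
        self.vertices_pushed += len(vertices)

    def set_affine(self, pathid, affine):
        self.affines[pathid] = affine


class Figure:
    def __init__(self):
        self.artists = {}          # path id -> [vertices, affine]
        self.renderer = None

    def add(self, pathid, vertices, affine):
        self.artists[pathid] = [vertices, affine]

    def set_renderer(self, renderer):
        # The expensive step: vertex data crosses to the backend here, once.
        self.renderer = renderer
        for pathid, (vertices, _) in self.artists.items():
            renderer.add_path(pathid, vertices)

    def draw(self):
        # The cheap step: only six numbers per path cross on each draw.
        for pathid, (_, affine) in self.artists.items():
            self.renderer.set_affine(pathid, affine)
```

With this split, panning a 1000-point line five times moves 1000 vertices
in total rather than 5000, since each redraw only replaces the affine.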

= Transformations =

Support a normal transformation architecture. The current draft
implementation assumes one nonlinear transformation, which happens at
a high layer, and all transformations after that are affines. In the
mpl1 draft, there are three affines: the transformation from view
limits -> axes units (AxesCoords.affineview), the transformation from
axes units to normalized figure units (AxesCoords.affineaxes), and the
transformation from normalized figure units to display
(Renderer.affinerenderer)
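
To make the chain concrete, here is a small numpy sketch of three 3x3
affines composed into one data-to-display matrix. The variable names
mirror the attributes named above, but the helper functions and the
numbers (view limits, axes rectangle, a 640x480 display) are assumptions
chosen for illustration.

```python
import numpy as np


def scale_translate(sx, sy, tx, ty):
    """3x3 affine that scales by (sx, sy) and then translates by (tx, ty)."""
    return np.array([[sx, 0.0, tx],
                     [0.0, sy, ty],
                     [0.0, 0.0, 1.0]])


def view_to_unit(xmin, xmax, ymin, ymax):
    """Map the view limits onto the unit square (axes units)."""
    sx, sy = 1.0 / (xmax - xmin), 1.0 / (ymax - ymin)
    return scale_translate(sx, sy, -xmin * sx, -ymin * sy)


def unit_to_rect(left, bottom, width, height):
    """Map the unit square onto a rectangle."""
    return scale_translate(width, height, left, bottom)


affineview = view_to_unit(0.0, 10.0, -1.0, 1.0)        # view limits -> axes units
affineaxes = unit_to_rect(0.1, 0.1, 0.8, 0.8)          # axes units -> figure units
affinerenderer = unit_to_rect(0.0, 0.0, 640.0, 480.0)  # figure units -> display

# Compose once; the rightmost matrix is applied first.
data_to_display = affinerenderer @ affineaxes @ affineview

point = np.array([5.0, 0.0, 1.0])   # homogeneous (x, y, 1)
display_xy = (data_to_display @ point)[:2]
```

Changing the view limits only rebuilds affineview and the product; the
data itself is untouched.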

Do we want to use 3x3 or 4x4 to leave the door open for 3D developers?

How do transformations (linear and nonlinear) play with Axis features
(ticking and gridding)? The ideal is a framework in which ticking,
gridding and labeling work intelligently with arbitrary, user
supplied, transformations. What is the proper transformation API?

= Objects that talk to the backend "primitives" =

Have just a few, fairly rich objects that the backends need to
understand. Clear candidates are a Path, Text and Image, but despite
their names, don't confuse these with the eponymous matplotlib
Artists, which are higher level than what I'm thinking of
here (eg matplotlib.text.Text does *a lot* of layout, and this would
be offloaded to the backend in this conception of the Text primitive).
Each of these will carry their metadata, eg a path will carry its
stroke color, facecolor, linewidth, etc..., and Text will carry its
font size, color, etc.... We may need some optimizations down the
road, but we should start small. For now, let's call these objects
"primitives".

This approach requires the backends to be smarter, but they have to
handle fewer entities.

= Where do the plot functions live? =

In matplotlib, the plot functions are matplotlib.axes.Axes methods and
I think there is consensus that this is a poor design. Where should
these live, what should they create, etc?

= How much of an intermediate artist layer do we need? =

Do we want to create high level objects like Circle, Rectangle and
Line, each of which manage a Path object under the hood? Probably,
for user convenience and general compatibility with matplotlib. By
using traits properly here, many current matplotlib Artists will be
thin interfaces around one or more primitives.

I think the whole matplotlib.collections module is poorly designed,
and should be chucked wholesale, in favor of faster, more elegant,
optimizations and special cases. Just having the right Path object
will reduce the need for many of these, eg LineCollection,
PolygonCollection, etc... Also, everything should be numpy enabled,
and the sequence-of-python-tuples approach that many of the
collections take should be dropped. Obviously some of the more useful
things there, like quad meshes, need to be ported and retained.

= Z-ordering, containers, etc =

Peter has been doing a lot of nice work on z-order and layers for
chaco, stuff that looks really useful for picking, interaction, etc...
We should look at this approach, and think carefully about how this
should be handled. Paul may be a good candidate for this, since he
has been working recently on the picking API.

= Extension code =

I would like to shed all of the CXX extension code -- it is just too
small a niche in the python world to base our project on. SWIG is
pretty clearly the right choice. mpl1 will use numpy for
transformations with some carefully chosen extension code where
necessary, to get rid of _transforms.cpp. I also plan to use the SWIG
agg wrapper, so this gets rid of _backend_agg. If we can enhance the
SWIG agg wrapper, we can also do images through there, getting rid of
_image.cpp. Having a fully featured, python-exposed agg wrapper will
be a plus in mpl and beyond. But with the agg license change, I'm
open to discussion of other approaches.

The major missing piece is ft2font, which is a pretty elaborate CXX
module. Michael may want to consider alternatives, including looking
at the agg support for freetype, and the kiva/chaco approach.

I want to do away with *all* GUI extension code. This should live
outside MPL if at all, eg in a toolkit if we need it. This means
someone needs to figure out how to get TkInter talking to a python
buffer object or a numpy array. Maintaining the GUI extension code
across platforms is an unending headache.

= Traits =

I think we should make a major commitment to traits and use them from
the ground up. Even without the UI stuff, they add plenty to make
them worthwhile, especially the validation and notification features.
With the UI (wx only), they are a major win for many GUI developers.
Compare the logic for sharing an x-axis using matplotlib transforms
with Axes.sharex with the approach used in mpl1.py with sync_trait-ed
affines.

= Axis handling =

The whole concept of the Axes object needs to be rethought, in light
of the fact that we need to support multiple axis objects on one Axes.
The matplotlib implementation assumes 1 xaxis and 1 yaxis per Axes,
and we hack two y-axis support (examples/two_scales.py) with some
transform shenanigans via twinx and multiple Axes where one is hidden,
but the approach is not scalable and is unwieldy.

This will require a fair amount of thought, but we should aim for
supporting an arbitrary number of axis objects, presumably associated
with individual artists or primitives. They also need to be *much*
faster. matplotlib uses Artists for each tick, tickline, gridline,
ticklabel, etc, and this is mind-numbingly slow. I have a prototype
axis implementation that draws the ticks with a single path using
repeated MOVETO and LINETO, for example, which will be incomparably
faster than using a separate object for each tick.
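
A sketch of the single-path tick idea, assuming a path is a pair of numpy
arrays (vertices and command codes). The numeric values of MOVETO and
LINETO and the tick_path helper are illustrative assumptions, and for
simplicity the tick length is expressed in the same coordinates as the
locations, which sidesteps the points-at-data-locations problem.

```python
import numpy as np

MOVETO, LINETO = 1, 2   # illustrative command codes, not taken from mpl1.py


def tick_path(locations, y=0.0, length=4.0):
    """Build one path holding a vertical tick at every x in `locations`.

    Returns (vertices, codes): vertices is (2N, 2), codes is (2N,), laid
    out as MOVETO, LINETO, MOVETO, LINETO, ... so each tick is a
    disconnected two-point segment inside the same path.
    """
    locations = np.asarray(locations, dtype=float)
    n = len(locations)
    vertices = np.empty((2 * n, 2))
    vertices[0::2, 0] = locations      # tick base
    vertices[0::2, 1] = y
    vertices[1::2, 0] = locations      # tick tip
    vertices[1::2, 1] = y - length
    codes = np.tile([MOVETO, LINETO], n)
    return vertices, codes
```

All N ticks then reach the backend as one object with 2N vertices instead
of N separate objects.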

The other important feature for axis support is that, for the most
part, they should be arbitrarily placeable (eg a "detached" axis).

= Breakage =

I think we need to be prepared to break the hell out of matplotlib.
The API will basically be a significant rewrite. pylab will still
mostly work unchanged -- that is the beauty of pylab -- though API
calls on return objects may be badly broken. We can mitigate this pain
if we desire with clever wrapper objects, but once you start calling
methods on return objects, you join the community of power users, and
this is the community I'm most willing to inconvenience with breakage.
We'll probably want to install into a new namespace, eg "mpl", and
envision both matplotlib and mpl co-existing for some time. In fact,
mpl might depend on matplotlib initially, eg until a CXX-free ft2font
is available.

We should expect to be supporting and using matplotlib for a long
time, since the proposals discussed here imply that it will be a long
wait until mpl1 is feature complete with matplotlib. In fact, we could
rightly consider this to be the mpl2 proposal, and keep releasing
matplotlib enhancements to 1.0 and beyond w/o significant breakage.
It's a nominal difference so I don't really have a preference.

Or we could forget all this wild speculation and resume our normally
scheduled lives.

= Chaco and Kiva =

It is a good idea for an enterprising developer to take a careful look
at the current Chaco and Kiva to see if we can further integrate with
them. I am gun shy because they seem formidable and complex, and one
of my major goals here is to streamline and simplify, but they are
incredible pieces of work and we need to carefully consider them,
especially as we integrate other parts of the enthought suite into our
core, eg traits, increasing the possibility of synergies.

= Unit handling, custom object types =

There is a legitimate need to be able to feed custom objects into
matplotlib. Recent versions of matplotlib support this with a unit
registry in the "units" module. A clear use case is plotting with
native python datetime objects, which is supported in 0.90 via the
unit handling, which should probably be called "custom object handling
and conversion". This is a deep and complicated subject, involving
questions of where the original data live, how they are converted to
useful types (arrays of floats) etc. It's worth thinking about this
as we discuss redesign issues.
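
As an illustration of "custom object handling and conversion", here is a
minimal stdlib-only sketch of a type-keyed converter registry. It is not
the API of the matplotlib "units" module; the names register, to_floats
and days_since_epoch, and the choice of 1970-01-01 as the epoch, are
assumptions made for the example.

```python
from datetime import datetime

# Hypothetical registry: maps a Python type to a function returning a float.
_converters = {}


def register(cls, func):
    """Associate a conversion function with a custom type."""
    _converters[cls] = func


def to_floats(values):
    """Convert a sequence of custom objects to a list of floats.

    The converter is looked up from the type of the first element;
    unregistered types fall back to float().
    """
    values = list(values)
    if not values:
        return []
    convert = _converters.get(type(values[0]), float)
    return [convert(v) for v in values]


def days_since_epoch(dt, epoch=datetime(1970, 1, 1)):
    """Express a datetime as fractional days since the (assumed) epoch."""
    return (dt - epoch).total_seconds() / 86400.0


register(datetime, days_since_epoch)
```

The original datetime objects stay with the caller; only the converted
floats would be handed on to the plotting layer.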

_Fernando_Perez1 · July 19, 2007, 5:28pm

Is Peter Wang on this list? If not, perhaps you should CC him and tip
him to come over. I know Robert monitors this, but we shouldn't make
him the single point of responsibility for keeping tabs on the bridges
with Chaco/ETS.

Just a minor logistical comment. Otherwise, go John!!!

In related news, I'll be posting the traits/configuration work I've
been playing with soon. I'm starting to like the ConfigObj/Traits
combo a LOT. Stay tuned.

Cheers,

f

···

On 7/19/07, John Hunter <jdh2358@...149...> wrote:

= Chaco and Kiva =

It is a good idea for an enterprising developer to take a careful look
at the current Chaco and Kiva to see if we can further integrate with
them. I am gun shy because they seem formidable and complex, and one
of my major goals here is to streamline and simplify, but they are
incredible pieces of work and we need to carefully consider them,
especially as we integrate other parts of the enthought suite into our
core, eg traits, increasing the possibility of synergies.

_Darren_Dale · July 19, 2007, 6:26pm

Hi John,

I've been working on a laboratory in which we can fruitfully discuss,
test, implement mpl1 design issues.

[...]

You will need the latest svn matplotlib and
the latest svn enthought traits 2 -- see the header of mpl1/mtraits.py
for install instructions for the latter.

I have not been able to install traits by following the instructions in
mtraits.py. easy_install is pulling in enthought.util-3.0a1, which conflicts
with enthought.resource-2.0b1. Why do you pull etsconfig, util and debug from
one place, and traits 2 from another? I would have thought it easier to do:

easy_install -f http://code.enthought.com/enstaller/eggs/source/unstable/
enthought.traits-2.0b2.dev-r12847

but that doesn't work either; it doesn't download any of traits' dependencies:

Searching for enthought.traits-2.0b2.dev-r12847
Reading http://code.enthought.com/enstaller/eggs/source/unstable/
Best match: enthought.traits-2.0b2.dev-r12847 [unknown version]
Downloading
http://code.enthought.com/enstaller/eggs/source/unstable/enthought.traits-2.0b2.dev-r12847.zip
Processing enthought.traits-2.0b2.dev-r12847.zip
Running enthought.traits-2.0b2.dev-r12847/setup.py -q
bdist_egg --dist-dir /tmp/easy_install-iDf8BC/enthought.traits-2.0b2.dev-r12847/egg-dist-tmp-oozCl0
install_requires:
        enthought.etsconfig >=2.0b1.dev, <3.a
        enthought.util >=2.0b1.dev, <3.a
test_requires:
        nose >= 0.9,
ui_requires:
        enthought.pyface >=2.0b1.dev, <3.a
        enthought.resource >=2.0b1.dev, <3.a
wx_requires:
        enthought.traits.ui.wx >=2.0b1.dev, <3.a
Adding enthought.traits 2.0b2.dev-r12847 to easy-install.pth file

Installed /usr/lib64/python2.5/site-packages/enthought.traits-2.0b2.dev_r12847-py2.5-linux-x86_64.egg
Skipping dependencies for enthought.traits 2.0b2.dev-r12847

···

On Thursday 19 July 2007 01:18:21 pm John Hunter wrote:

_John_Hunter · July 19, 2007, 6:26pm

I think the answer is because the install is broken and you have to
get some combination of packages that work together through a little
hackery. They're working on it ...

I encountered a similar problem at home last night, and Dave
recommended a fix on the enthought list. I haven't had a chance to test
this yet. If this works, please update the install instructions in
mtraits if you get a minute.

Here is Dave's answer::

The problem is that this command, without any versions specified, is
mixing versions from the ETS 2.5 release and the still nascent ETS
3.0. And, as you're seeing, they don't mix because most of the 2.x stuff
declares that it doesn't work with anything later than a 2.x version --
i.e. enthought.resource 2.0b2 requires an enthought.util version less
than 3.0a but you've already installed enthought.util version 3.0a1.
Easy_install is just doing its job here by giving you an error.

Clearly we need to get the uninstallable ETS 3.0 components out of the
various repos -- at least until they install smoothly. (See below
discussion.) In the meantime, to resolve this for yourself, back out
any 3.0a enthought components, and then do the following command:

sudo easy_install -f \
http://code.enthought.com/enstaller/eggs/source/unstable/ \
"enthought.etsconfig < 3.0a" "enthought.util <3.0a" \
"enthought.debug <3.0a"

For the rest of the world, I think we have to pull out the ETS v3.x
components from the repo until we get them to where they can be
installed. Bryce, you'll have to get them from somewhere else for your
project -- perhaps the customer-specific repo. Let's talk tomorrow and
then I can get them out of the repo and people can stop running into
this problem.

-- Dave

···

On 7/19/07, Darren Dale <dd55@...143...> wrote

On Thursday 19 July 2007 01:18:21 pm John Hunter wrote:

I have not been able to install traits by following the instructions in
mtraits.py. easy_install is pulling in enthought.util-3.0a1, which conflicts
with enthought.resource-2.0b1. Why do you pull etsconfig, util and debug from
one place, and traits 2 from another? I would have thought it easier to do:

Gael_Varoquaux1 · July 19, 2007, 6:32pm

> I have not been able to install traits by following the instructions in
> mtraits.py. easy_install is pulling in enthought.util-3.0a1, which conflicts
> with enthought.resource-2.0b1. Why do you pull etsconfig, util and debug from
> one place, and traits 2 from another? I would have thought it easier to do:

I think the answer is because the install is broken and you have to
get some combination of packages that work together through a little
hackery. They're working on it ...

You replied faster than I could grep my mailbox :->.

The problem is that the traits 2 egg that you want to install depends on
ets2 components, but the dependency has not been well coded in the
package (ets2 is in general API-incompatible with ets3, and the
dependency should specify a version number below 3.a), so some ets3
components get pulled in. So it's a packaging bug that will be addressed.
In the meantime, John's solution is the right answer.

Gaël

PS: sorry for the dup, John, I "miss-mutted"

···

On Thu, Jul 19, 2007 at 01:26:05PM -0500, John Hunter wrote:

On 7/19/07, Darren Dale <dd55@...143...> wrote
> On Thursday 19 July 2007 01:18:21 pm John Hunter wrote:

_Darren_Dale · July 19, 2007, 6:49pm

>
> I have not been able to install traits by following the instructions in
> mtraits.py.

[...]

I encountered a similar problem at home last night, and Dave
recommended on the enthought list. I haven't had a chance to test
this yet. If this works, please update the install instructions in
mtraits if you get a minute.

Here is Dave's answer::

[...]

sudo easy_install -f \
http://code.enthought.com/enstaller/eggs/source/unstable/ \
"enthought.etsconfig < 3.0a" "enthought.util <3.0a" \
"enthought.debug <3.0a"

That worked. The instructions in mtraits have been updated.

···

On Thursday 19 July 2007 02:26:05 pm John Hunter wrote:

On 7/19/07, Darren Dale <dd55@...143...> wrote
> On Thursday 19 July 2007 01:18:21 pm John Hunter wrote:

Eric_Firing2 · July 19, 2007, 7:02pm

Darren Dale wrote:

···

On Thursday 19 July 2007 02:26:05 pm John Hunter wrote:

On 7/19/07, Darren Dale <dd55@...143...> wrote

On Thursday 19 July 2007 01:18:21 pm John Hunter wrote:

I have not been able to install traits by following the instructions in
mtraits.py.

[...]

I encountered a similar problem at home last night, and Dave
recommended on the enthought list. I haven't had a chance to test
this yet. If this works, please update the install instructions in
mtraits if you get a minute.

Here is Dave's answer::

[...]

sudo easy_install -f \
http://code.enthought.com/enstaller/eggs/source/unstable/ \
"enthought.etsconfig < 3.0a" "enthought.util <3.0a" \
"enthought.debug <3.0a"

That worked. The instructions in mtraits have been updated.

The instructions still say to check out traits 2.0, but Robert is recommending that we go with traits 3. Do you really want to stick with version 2 now?

Eric

_John_Hunter · July 19, 2007, 7:05pm

No, I'm happy to move over. But I spent way more time getting traits
working and installed than I wanted to, and I wanted to spend most of
my time coding the sketch, so once I had it working I did not want to
break it. If someone wants to take the lead getting a working traits3
install with instructions and then migrate mpl1 (probably not much to
do there) I'm happy to switch over. I think Robert was recommending
the first release of Traits3 for us, which hasn't happened yet. But
if the svn version is working and installable, I'm happy to make the
switch now if advised.

JDH

···

On 7/19/07, Eric Firing <efiring@...229...> wrote:

The instructions still say to check out traits 2.0, but Robert is
recommending that we go with traits 3. Do you really want to stick with
version 2 now?

Eric_Firing2 · July 19, 2007, 7:14pm

John Hunter wrote:

The instructions still say to check out traits 2.0, but Robert is
recommending that we go with traits 3. Do you really want to stick with
version 2 now?

No, I'm happy to move over. But I spent way more time getting traits
working and installed than I wanted to, and I wanted to spend most of
my time coding the sketch, so once I had it working I did not want to
break it. If someone wants to take the lead getting a working traits3
install with instructions and then migrate mpl1 (probably not much to
do there) I'm happy to switch over. I think Robert was recommending
the first release of Traits3 for us, which hasn't happened yet. But
if the svn version is working and installable, I'm happy to make the
switch now if advised.

JDH

John,

I thought initially that a simple svn checkout of 3 was working with a slight tweak (editing api.py), but it looks like there are still some dependencies that don't show up immediately but that do show up when trying to run mpl1; it is again the ui code trying to pull in things from outside traits. So I don't have an immediate solution. It looks like the effort to make traits independently installable still has a ways to go.

Eric

···

On 7/19/07, Eric Firing <efiring@...229...> wrote:

Peter_Wang · July 19, 2007, 7:58pm

Actually I am subscribed to the list, but thanks to Robert for pointing out this thread to me.

John, much of what you have written is very interesting, and I will have a more detailed response later. I just want to say real quick, though, that I have been trying to work out Chaco's next architectural steps, and there is definitely some overlap with what you've outlined, but coming from a different direction.

-Peter

···

On Jul 19, 2007, at 12:28 PM, Fernando Perez wrote:

Is Peter Wang on this list? If not, perhaps you should CC him and tip
him to come over. I know Robert monitors this, but we shouldn't make
him the single point of responsibility for keeping tabs on the bridges
with Chaco/ETS.

_Bill_Baxter · July 19, 2007, 8:05pm

Chaco may be formidable and complex, but so is the list of features
and requirements you just posted. What about just focusing on a Pylab
wrapper for Chaco? And working with Peter to make Chaco everything
you envision. Or does Chaco have the same needs-a-rewrite architecture
issues as the mpl?

Just to be clear, I don't have any first hand experience with Chaco,
other than running the demos once. The main problems with Chaco I'm
aware of are 1) entanglement with the rest of ETS, which they're
working on, 2) no pylab like easy-to-use interface.

--bb

···

On 7/20/07, John Hunter <jdh2358@...149...> wrote:

= Chaco and Kiva =

It is a good idea for an enterprising developer to take a careful look
at the current Chaco and Kiva to see if we can further integrate with
them. I am gun shy because they seem formidable and complex, and one
of my major goals here is to streamline and simplify, but they are
incredible pieces of work and we need to carefully consider them,
especially as we integrate other parts of the enthought suite into our
core, eg traits, increasing the possibility of synergies.

Peter_Wang · July 19, 2007, 8:28pm

Chaco may be formidable and complex, but so is the list of features
and requirements you just posted. What about just focusing on a Pylab
wrapper for Chaco? And working with Peter to make Chaco everything
you envision. Or does Chaco have the same needs-a-rewrite architecture
issues as the mpl?

There are certainly directions I'd like to take the architecture, but I'm not planning a rewrite anytime soon. One rewrite every 4 years is more than enough for me.

Just to be clear, I don't have any first hand experience with Chaco,
other than running the demos once. The main problems with Chaco I'm
aware of are 1) entanglement with the rest of ETS, which they're
working on, 2) no pylab like easy-to-use interface.

(1): Other than traits (and a teensy bit of traits UI), Chaco requires only Kiva and Enable. Its setup.py reflects this. This has been the case for a while, but historically the issue has been that all the interdependencies at the traits UI level sucked in basically the rest of ETS.

(2): Chaco2.shell has some rudimentary pylab-like features, but obviously is nowhere near complete. There are some examples of the sorts of things it can do: https://svn.enthought.com/enthought/browser/branches/enthought.chaco2_2.0/examples/shell. One thing to note about the shell is that its commands are just convenience functions that wrap existing Chaco containers and components, so the structure of the live plot that is built with, say, an imshow() command is similar to one that you could build by hand. This means that you can dynamically extend its behavior by adding new tools that the command-line interface doesn't know about. It also means that you can use the command-line interface to construct a plot (or grid of plots) and trivially embed that into an external application, no differently than if you had hand-coded to the lower, object-oriented layer.

-Peter

···

On Jul 19, 2007, at 3:05 PM, Bill Baxter wrote:

_Chris.Barker · July 19, 2007, 10:31pm

Lots of good stuff John!

There is also the question of whether
we want to pay up and use 4x4 from the ground up and just ignore the
3rd dimension to open the door for 3D support.

I say yes! 3-d really is a very often needed and requested feature. Sure, we can go to VTK or something for really sophisticated 3-d work, but being able to do the basic stuff with MPL would be wonderful.

If the framework supports it cleanly internally, it's much more likely that the 3-d stuff will get written.

This is potentially a major win, because we currently
move the data around on every draw.

Is it that expensive to push data around? In any case, it does sound cleaner and more efficient not to.

Do we want to use 3x3 or 4x4 to leave the door open for 3D developers?

4X4 -- is there much cost?

This approach requires the backends to be smarter, but they have to
handle fewer entities.

How many back-ends does the future hold? It seems if the GUI toolkits all use *Agg, then that's only one renderer for all of them. Then we need:

SVG
PDF
PS
???

Cairo would be nice, as it gives us almost all of them at once, but I guess licensing keeps that a non-starter. Oh well.

In matplotlib, the plot functions are matplotlib.axes.Axes methods and
I think there is consensus that this is a poor design.

Well, the OO interface has always felt a bit clunky to me, but I'm not sure where else plot functions could go -- I'd love to hear ideas, though.

Do we want to create high level objects like Circle, Rectangle and
Line, each of which manage a Path object under the hood?

I like that idea -- working with Paths should be saved for the gurus.

Just having the right Path object
will reduce the need for many of these, eg LineCollection,
PolygonCollection, etc...

sounds good.

Also, everything should be numpy enabled,
and the sequence-of-python-tuples approach that many of the
collections take should be dropped.

who hoo!

However, numpy doesn't handle "ragged" arrays well. I wonder if there's a good way to implement those, so that transforms can be done numpy-efficient.
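
One common workaround, sketched here as an assumption rather than
anything proposed in this thread: store the ragged polylines as one flat
vertex array plus their lengths, run the transform once on the flat
array, and split the result back. The function name transform_ragged is
hypothetical.

```python
import numpy as np


def transform_ragged(polylines, affine):
    """Apply one 3x3 affine to a list of (Ni, 2) arrays of varying length.

    All vertices are stacked into a single (sum(Ni), 2) array so the
    matrix product runs once, then the result is split back apart.
    """
    lengths = [len(p) for p in polylines]
    flat = np.concatenate(polylines)                          # (total, 2)
    homogeneous = np.hstack([flat, np.ones((len(flat), 1))])  # (total, 3)
    moved = (homogeneous @ affine.T)[:, :2]
    return np.split(moved, np.cumsum(lengths)[:-1])
```

The per-polyline Python loop is reduced to computing lengths; the
arithmetic itself happens in a single numpy call.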

= Extension code =

If we can enhance the
SWIG agg wrapper, we can also do images through there, getting rid of
_image.cpp. Having a fully featured, python-exposed agg wrapper will
be a plus in mpl and beyond.

Very nice.

But with the agg license change, I'm
open to discussion of other approaches.

hmm GPL now. Well, maybe Cairo's LGPL isn't so bad after all!

I want to do away with *all* GUI extension code.

yeah!

= Traits =
I think we should make a major commitment to traits and use them from
the ground up.

Good plan.

= Breakage =

I think we need to be prepared to break the hell out of matplotlib.
The API will basically be a significant rewrite.

Well worth it.

> pylab will still
> mostly work unchanged -- that is the beauty of pylab

As a rule for the future though, a stable OO interface would be nice.

Or we could forget all this wild speculation and resume our normally
scheduled lives.

no!!

= Chaco and Kiva =

It is a good idea for an enterprising developer to take a careful look
at the current Chaco and Kiva

OK. I have to ask -- why aren't we all just using Chaco? I know I'm not because ??years ago, Enthought was not really supporting anything but Windows -- is that still true? Would it be a whole lot less work to support GTK, OS-X, ??? in Chaco than keep developing a separate lib?

Great conversation starters!

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

_John_Hunter · July 19, 2007, 11:31pm

> This is potentially a major win, because we currently
> move the data around on every draw.

Is it that expensive to push data around? In any case, it does sound
cleaner and more efficient not to.

It can be very expensive. Imagine you are smoothly panning or zooming
a line object with 100,000 x,y points. All you are really doing is
changing the affine. Although we've done some things to help this
case, in matplotlib we still have to create a new path object every
time in the agg backend, and then transform it. It's much cheaper
just to push the affine to the backend in this case. So interaction
with large data sets should get better.

> Do we want to use 3x3 or 4x4 to leave the door open for 3D developers?

4X4 -- is there much cost?

The potential cost is not in the 3x3 vs 4x4, but in the extra row of
junk data you would store in the data matrix, which is N extra values
for plotting N points. The matrix multiplication would be 3x3 * 3xN
vs 4x4 * 4xN, so there would be a cost in memory and performance.
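
The two layouts can be compared directly in numpy. This is only a sketch
under assumed values (N = 100,000 float64 points, an arbitrary scale and
translation); for that N the extra z row costs 800,000 bytes, and the
x,y results are identical.

```python
import numpy as np

N = 100_000
rng = np.random.default_rng(0)
xy = rng.random((2, N))

# 2D layout: homogeneous 3xN data with a 3x3 affine.
data3 = np.vstack([xy, np.ones((1, N))])
affine3 = np.array([[2.0, 0.0, 1.0],
                    [0.0, 3.0, 5.0],
                    [0.0, 0.0, 1.0]])

# 3D-ready layout: a z row of zeros is carried along in 4xN data.
data4 = np.vstack([xy, np.zeros((1, N)), np.ones((1, N))])
affine4 = np.eye(4)
affine4[:2, :2] = affine3[:2, :2]   # same x,y scale
affine4[:2, 3] = affine3[:2, 2]     # same x,y translation

out3 = affine3 @ data3
out4 = affine4 @ data4

extra_bytes = data4.nbytes - data3.nbytes   # N extra float64 values
```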

> This approach requires the backends to be smarter, but they have to
> handle fewer entities.

How many back-ends does the future hold? It seems if the GUI toolkits
all use *Agg, then that's only one render for all of them. Then we need:

SVG
PDF
PS
Cairo would be nice, as it gives us almost all of them at once, but I
guess licensing keeps that a non-starter. Oh well.

Not at all, we want to fully support Cairo. We just want to have some
fully BSD compliant backends as well. agg 2.4 will remain BSD and I
don't have too much of a problem relying on it. We are not alone in
needing a BSD agg. I think the 4 you mentioned plus *Agg are the
ones we should target. The goal is to get all the GUIs to work with a
python buffer object or a numpy pixel buffer array -- if Agg and Cairo
can provide the same buffer or numpy format, then we would
automagically get *Agg and *Cairo across the GUIs.

JDH

···

On 7/19/07, Christopher Barker <Chris.Barker@...236...> wrote:

_Darren_Dale · July 19, 2007, 11:39pm

I also think we should use 4x4 affines, and ignore the third dimension. 10
years down the line, we might look back and regret not taking advantage of
this opportunity.

···

On Thursday 19 July 2007 6:31:26 pm Christopher Barker wrote:

> There is also the question of whether
> we want to pay up and use 4x4 from the ground up and just ignore the
> 3rd dimension to open the door for 3D support.

I say yes! 3-d really is a very often needed and requested feature.
Sure, we can go to VTK or something for really sophisticated 3-d work,
but being able to do the basic stuff with MPL would be wonderful.

If the framework supports it cleanly internally, it's much more likely
that the 3-d stuff will get written.

_Darren_Dale · July 19, 2007, 11:42pm

Is there much demand for BSD-compliant svg, pdf, and ps backends?

···

On Thursday 19 July 2007 7:31:11 pm John Hunter wrote:

On 7/19/07, Christopher Barker <Chris.Barker@...236...> wrote:
> How many back-ends does the future hold? It seems if the GUI toolkits
> all use *Agg, then that's only one render for all of them. Then we need:
>
> SVG
> PDF
> PS
> Cairo would be nice, as it gives us almost all of them at once, but I
> guess licensing keeps that a non-starter. Oh well.

Not at all, we want to fully support Cairo. We just want to have some
fully BSD compliant backends as well.

system · July 20, 2007, 2:38am

(oops, I meant to send that to the matplotlib list)

Hi,

I was looking at the transform code recently..

The potential cost is not in the 3x3 vs 4x4, but in the extra row of
junk data you would store in the data matrix, which is N extra values for
plotting N points. The matrix multiplication would be 3x3 * 3xN vs 4x4 *
4xN, so there would be a cost in memory and performance.

I'm not so clear about what you are planning for the transforms and
matrices in mpl1, especially in relation to the 4x4 to 3x3 matrices.
Couldn't you just pass around 4x4 matrices, but then truncate them to 3x3
right before you apply them? If you are passing any affine transforms to
the backend, you are going to be breaking apart your matrix anyway. (agg
accepts the affine transform as a tuple like a,b,c,d,tx,ty)
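
A minimal sketch of the "pass 4x4, truncate to 3x3" idea, using numpy. The helper names are hypothetical, and the tuple ordering assumes the convention x' = a*x + c*y + tx, y' = b*x + d*y + ty. The truncation is only valid when the 4x4 matrix does not mix z into x or y and has no perspective row.

```python
import numpy as np

def embed_2d_affine(a, b, c, d, tx, ty):
    """Embed a 2D affine in a 4x4 matrix that leaves z untouched."""
    return np.array([
        [a,   c,   0.0, tx],
        [b,   d,   0.0, ty],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])

def truncate_to_3x3(m4):
    """Drop the z row and z column, keeping x, y and the homogeneous part."""
    keep = [0, 1, 3]
    return m4[np.ix_(keep, keep)]

def as_six_tuple(m3):
    """Break a 3x3 affine apart into (a, b, c, d, tx, ty)."""
    return tuple(float(v) for v in
                 (m3[0, 0], m3[1, 0], m3[0, 1], m3[1, 1], m3[0, 2], m3[1, 2]))

m4 = embed_2d_affine(2.0, 0.0, 0.0, 3.0, 10.0, 20.0)
m3 = truncate_to_3x3(m4)

# N points stored as 3xN (x, y, 1) versus 4xN (x, y, 0, 1).
xy = np.array([[0.0, 1.0, 2.0],
               [0.0, 1.0, 2.0]])
n = xy.shape[1]
h3 = np.vstack([xy, np.ones(n)])
h4 = np.vstack([xy, np.zeros(n), np.ones(n)])

out3 = m3 @ h3   # 3x3 * 3xN
out4 = m4 @ h4   # 4x4 * 4xN, carries one extra row of N values
```

Both products give the same x and y rows; the 4xN form only adds the unused z row, which is the memory cost discussed above.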

Also, my impression is that the matrix multiplication strategy in numpy is
going to be slow if it happens a lot. I am guessing what you are going to
do is do a matrix mult once just for the nonlinear transform when show()
is called, but it will not happen for redraws (due to panning etc). When
panning, only the affine part is changed, and the backend takes care of
that efficiently (in C, for agg). Therefore the matrix mult is very rare.
Is that correct?

Allan

···

On Thu, July 19, 2007 7:31 pm, John Hunter wrote:

Andrew_Straw5 · July 20, 2007, 2:47am

John Hunter wrote:

Do we want to use 3x3 or 4x4 to leave the door open for 3D developers?

4X4 -- is there much cost?

The potential cost is not in the 3x3 vs 4x4, but in the extra row of
junk data you would store in the data matrix, which is N extra values
for plotting N points. The matrix multiplication would be 3x3 * 3xN
vs 4x4 * 4xN, so there would be a cost in memory and performance.

I'm not sure this is the case -- there are still going to have to be
transforms which aren't 4x4 projections (e.g. polar transforms), so
presumably we keep the 2D affine transform as another case, just the one
that gets used 99% of the time. Perhaps it could even subclass the 4x4
projection transform but optimize away the need for 4D data on backends
where this is an optimization (all the 2D ones) and keep the 3rd and 4th
dimensions when sending to OpenGL or wherever. Or am I missing something?

My biggest mental stumbling block (which IIUC is already solved for
mpl0, so it is really just my stumbling block) is doing placements like
"OK, here are 10 million data points which need to go through this polar
transform, but also, plot this text such that the anchor is 2 pixels
above the 439th point." mpl can clearly do this by keeping a copy of the
transform before it pushes it off to the backend, but this gets trickier
when you need reference coordinates that only the backend can compute,
such as relative to rendered strings. Perhaps an explicit multipass
system is necessary? (This whole kettle of fish is actually the part of
mpl that I currently don't understand at all, so I'm probably
mischaracterizing the situation.)
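
A minimal sketch of the placement problem described above, assuming the Python side keeps its own copy of the transform: a non-affine polar step followed by a 2D affine into pixels, with the text anchor computed from a single point plus a pixel offset. The function names, the 400x400 canvas, and the y-up display convention are assumptions; a y-down backend would subtract the offset instead. It uses 1000 points rather than 10 million.

```python
import numpy as np

def polar_to_cartesian(theta, r):
    """Non-affine step: (theta, r) data to x, y as an N x 2 array."""
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

def apply_affine(m3, xy):
    """Affine step: apply a 3x3 affine to an N x 2 array of points."""
    return xy @ m3[:2, :2].T + m3[:2, 2]

theta = np.linspace(0.0, 2.0 * np.pi, 1000)
r = np.linspace(0.0, 1.0, 1000)

# Map the square [-1, 1] x [-1, 1] onto a 400 x 400 pixel canvas, y up.
data_to_display = np.array([[200.0, 0.0, 200.0],
                            [0.0, 200.0, 200.0],
                            [0.0, 0.0, 1.0]])

# Full pipeline for every point, as the renderer would need it.
display = apply_affine(data_to_display, polar_to_cartesian(theta, r))

def text_anchor(index, offset_px=(0.0, 2.0)):
    """Transform only the one data point, then shift by a pixel offset."""
    point = polar_to_cartesian(theta[index:index + 1], r[index:index + 1])
    return apply_affine(data_to_display, point)[0] + np.asarray(offset_px)

anchor = text_anchor(439)
```

This only covers offsets known in advance. Anchors that depend on quantities the backend computes, such as the extent of a rendered string, would still need a query back to the backend or a second pass.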

Anyhow this is exciting, and I wish I had more time to jump in and help
code...

Ken_McIvor2 · July 20, 2007, 3:42am

Wow, lots of food for thought. Thanks John!

= Objects that talk to the backend "primitives" =

Have just a few, fairly rich objects, that the backends need to
understand. Clear candidates are a Path, Text and Image, but despite
their names, don't confuse these with the eponymous matplotlib
Artists, which are higher level than what I'm thinking of
here (eg matplotlib.text.Text does *a lot* of layout, and this would
be offloaded to the backend in this conception of the Text primitive).

This sounds like a great idea. I think you should consider making transformations a first-class citizen as well, to improve code readability. Some of us have to think really hard to translate "[[width, 0, 0], [0, -height, height], [0, 0, 1]]" into English!
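
A minimal sketch of what more readable transformations could look like, assuming small named constructors built on numpy. The names `scale` and `translate` are hypothetical and are not taken from mpl1.py. Composed, they reproduce the matrix quoted above, which in English reads: stretch the unit square to the canvas size, flip y, then shift by the canvas height so the origin moves from bottom-left to top-left.

```python
import numpy as np

def scale(sx, sy):
    """3x3 affine that scales x by sx and y by sy."""
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

def translate(tx, ty):
    """3x3 affine that shifts points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

width, height = 640.0, 480.0

# Read right to left: scale and flip first, then translate.
unit_to_pixels = translate(0.0, height) @ scale(width, -height)

# The unit square's corners in homogeneous coordinates.
bottom_left = unit_to_pixels @ np.array([0.0, 0.0, 1.0])
top_right = unit_to_pixels @ np.array([1.0, 1.0, 1.0])
```

Here `bottom_left` lands at pixel (0, height) and `top_right` at (width, 0), which is the y-down convention most raster canvases use.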

= Where do the plot functions live? =

In matplotlib, the plot functions are matplotlib.axes.Axes methods and
I think there is consensus that this is a poor design. Where should
these live, what should they create, etc?

I propose that the plot functions be replaced by plot classes that implement whatever the new Artist API ends up being. I'm not sure where I'd put them (grouped into modules by type?) or how they should be added to an Axes (add_artist()?).

= How much of an intermediate artist layer do we need? =

Do we want to create high level objects like Circle, Rectangle and
Line, each of which manage a Path object under the hood?

Yes, but I don't think they should be subclassed from Path. I found that rather confusing because I read that as "Rectangle is-a Path".
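
A minimal sketch of the "has-a" alternative, assuming a bare-bones `Path` primitive with vertices and drawing codes. Both classes and the code constants are hypothetical illustrations, not the mpl1.py definitions.

```python
from dataclasses import dataclass

# Hypothetical drawing codes for the Path primitive.
MOVETO, LINETO, CLOSEPOLY = 1, 2, 3

@dataclass
class Path:
    """Low-level primitive the backend understands."""
    vertices: list
    codes: list
    facecolor: str = "none"
    edgecolor: str = "black"

class Rectangle:
    """High-level artist that builds a Path rather than being one."""

    def __init__(self, x, y, width, height, facecolor="none"):
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.facecolor = facecolor

    @property
    def path(self):
        """Build the Path from the current geometry on demand."""
        x0, y0 = self.x, self.y
        x1, y1 = x0 + self.width, y0 + self.height
        vertices = [(x0, y0), (x1, y0), (x1, y1), (x0, y1), (x0, y0)]
        codes = [MOVETO, LINETO, LINETO, LINETO, CLOSEPOLY]
        return Path(vertices, codes, facecolor=self.facecolor)

rect = Rectangle(0, 0, 2, 1, facecolor="blue")
rect.width = 3            # geometry changes show up in the next path
outline = rect.path
```

Rebuilding the path on every access is the simplest choice; a real implementation would likely cache it and invalidate on change.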

= Z-ordering, containers, etc =

Peter has been doing a lot of nice work on z-order and layers for
chaco, stuff that looks really useful for picking, interaction, etc...
We should look at this approach, and think carefully about how this
should be handled.

Is there somewhere in particular that I can look to see what Peter's been working on? Enthought's svn repositories?

I also plan to use the SWIG agg wrapper, so this gets rid of _backend_agg. If we can enhance the SWIG agg wrapper, we can also do images through there, getting rid of _image.cpp. Having a fully featured, python-exposed agg wrapper will be a plus in mpl and beyond. But with the agg license change, I'm open to discussion of other approaches.

Have you looked into Fredrik Lundh's aggdraw module? Apart from not yet supporting image drawing/blitting, I think it might do the trick. I've attached an aggdraw version of your mpl1.py that I wrote as a proof of concept. I opted to start hacking instead of installing the traits package, so there's just a basic demo of the renderer for now.

Since aggdraw can work natively with the Python Imaging Library we'd be able to support every raster format under the sun *and* save those images to Python strings. That would make the webapps people very happy.

I want to do away with *all* GUI extension code.

Hurrah!

This means someone needs to figure out how to get TkInter talking to a python
buffer object or a numpy array.

I think PIL's ImageTk module would do the trick for converting RGBA -> PIL Image -> Tk Bitmap/PhotoImage.

= Traits =

I think we should make a major commitment to traits and use them from
the ground up. Even without the UI stuff, they add plenty to make
them worthwhile, especially the validation and notification features.

I hate to be the first one to disagree, but here goes: traits give me the heebie-jeebies. I agree that matplotlib 1.0/2.0 needs to validate all user-settable parameters. However, I'm concerned about the development overhead that might result from making traits a core dependency. Code readability is also a concern to me -- the experience of reading mpl1.py suggests to me that newcomers might find traits a bit too "voodoo". I'm confident that the same thing could be achieved using Python properties to validate attributes. Change notification is another matter, granted, but I think that a major rewrite will provide the opportunity to better design for those situations.
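
A minimal sketch of the properties-based alternative, assuming a hypothetical `Line` class with a single validated attribute. The hand-rolled observer list shows roughly what change notification costs without traits; none of these names come from mpl1.py.

```python
class Line:
    """Hypothetical artist with one validated, observable attribute."""

    def __init__(self, linewidth=1.0):
        self._observers = []
        self.linewidth = linewidth   # goes through the setter below

    @property
    def linewidth(self):
        return self._linewidth

    @linewidth.setter
    def linewidth(self, value):
        value = float(value)         # rejects values that are not numeric
        if value < 0:
            raise ValueError("linewidth must be non-negative")
        old = getattr(self, "_linewidth", None)
        self._linewidth = value
        # Notify only on a real change, not on the initial assignment.
        if old is not None and old != value:
            for callback in self._observers:
                callback("linewidth", old, value)

    def observe(self, callback):
        """Register callback(name, old, new) for attribute changes."""
        self._observers.append(callback)

events = []
line = Line()
line.observe(lambda name, old, new: events.append((name, old, new)))
line.linewidth = 2        # valid: converted to 2.0, observers notified

try:
    line.linewidth = -1   # invalid: rejected, value stays at 2.0
    rejected = False
except ValueError:
    rejected = True
```

Validation is short to write this way, but every attribute repeats the same getter, setter and notification boilerplate, which is the part traits declares in one line.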

Ken

aggdraw_mpl1.py (5.13 KB)

···

On Jul 19, 2007, at 12:18 PM, John Hunter wrote:

Gael_Varoquaux1 · July 20, 2007, 6:37am

Well, I am a stupid noob as far as coding goes (hell, I am an
experimentalist, I work with a soldering iron and a mill, not a
computer), but I can tell you I find traited code very readable. It is
just that the API is very well done. It could have indeed been a
nightmare.

I think Python properties would be harder to read. Traits simply look
like type checking (that's pretty much what it is, though), so people get
it really quickly. Besides, the manual is very well written.

But I am biased: Traits got me out of a difficult moment not
long ago, and I fell in love with them.

My 2 cents,

Gaël

···

On Thu, Jul 19, 2007 at 10:42:56PM -0500, Ken McIvor wrote:

>= Traits =

>I think we should make a major commitment to traits and use them from
>the ground up. Even without the UI stuff, they add plenty to make
>them worthwhile, especially the validation and notification features.

Code readability is also a concern to me -- the experience of reading
mpl1.py suggests to me that newcomers might find traits a bit too
"voodoo". I'm confident that the same thing could be achieved using
Python properties to validate attributes.