overriding builtin variables in pylab

John_Hunter1 · January 9, 2005, 9:16pm

    > I fear that this "fix" will rather add to the general
    > confusion than lighten it. First doing "from ... import
    > *" and then reverting it partly really seems like a bad
    > idea. The cleaner solution would be to fix
    > Numeric/numarray in such a way that the replacement
    > min/max/round/... retain the complete original
    > functionality of the builtins only extending them to new
    > types. Independent of pylab, there obviously is a bug in
    > the underlying libraries which should not be obscured but
    > solved.

This is not a bug in either matplotlib, Numeric or numarray. The min
and max in question were originally defined in Numeric's MLab module.
If you read that module's docstring, it says:

Matlab(tm) compatibility functions.

This will hopefully become a complete set of the basic functions available in
matlab. The syntax is kept as close to the matlab syntax as possible.

So the functions in question were explicitly designed to be compatible
with matlab, not python. The pylab interface of matplotlib was
designed to "finish the job" that MLab started, and extend the matlab
compatibility into the plotting arena, which it does quite faithfully.
Hence the pylab module imports the symbols from MLab (or from
numarray.linear_algebra.mlab when numarray is in use).

Because python handles namespaces well, there is no problem with MLab
defining a min with a different call syntax than the built-in. The
informed user can choose between them by avoiding the 'from MLab
import *' or 'from pylab import *' and using namespaces instead, e.g.
MLab.min or pylab.min.
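
To make the namespace point concrete, here is a minimal sketch written for current Python 3. It does not use the real MLab: `fake_mlab` is a hypothetical stand-in module whose `min` takes `(m, axis=0)` and works on a plain list of rows, and the helper name `_mlab_style_min` is likewise an assumption.

```python
import builtins
import sys
import types

# Hypothetical stand-in for MLab: its min() takes (m, axis=0), as in matlab,
# and works here on a plain list of rows instead of a Numeric array.
fake_mlab = types.ModuleType("fake_mlab")


def _mlab_style_min(m, axis=0):
    """Column minima for axis=0, row minima otherwise."""
    if axis == 0:
        return [builtins.min(col) for col in zip(*m)]
    return [builtins.min(row) for row in m]


fake_mlab.min = _mlab_style_min
sys.modules["fake_mlab"] = fake_mlab  # make the stand-in importable

rows = [[3, 1], [2, 5]]

# Qualified access: both functions stay reachable under their own names.
import fake_mlab as MLab

qualified_result = MLab.min(rows)   # matlab-style column minima
builtin_result = min(3, 1, 2)       # the builtin is untouched

# Star import into a scratch namespace: the name 'min' is rebound there.
scratch = {}
exec("from fake_mlab import *", scratch)
star_min_is_mlab = scratch["min"] is fake_mlab.min
```

With the qualified import, `min` and `MLab.min` coexist. After the star import, the bare name `min` in that namespace refers to the matlab-style function.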

In an effort to provide a matlab compatible environment, I have
encouraged the use of 'from pylab import *', which like matlab gives
you everything at your fingertips. Mainly this is for convenience to
the newbie and interactive user, who doesn't want to have to sort out
where all the symbols are. If you take a quick look at na_imports,
which imports the numarray symbols for matplotlib.numerix, you'll see
this is nontrivial, with functions coming from a handful of different
places.

So I feel the current situation is reasonably coherent: pylab provides
matlab compatible functions. That said, I agree this is a problem,
and part of the migration from matplotlib.matlab to matplotlib.pylab
is made with the sense that this is ultimately a *python* library that
strives for matlab compatibility but not at all costs. I myself have
been bitten by the min/max confusion on more than one occasion, as
have other mpl developers.

So I see the merits of Andrew's proposal. If adopted, we should
provide nxmin, nxmax, nxround, etc in the pylab namespace and
advertise the changes widely in API_CHANGES, the CHANGELOG, the
tutorial and user's guide. We should be mindful that changing the
current behavior probably will break some mpl scripts and can result
in a *substantial* performance hit for users who use min and max in
place of nxmin and nxmax for numerix arrays because those functions
rely on the python sequence protocol. This was the source of a
considerable performance hit in the colormapping that was recently
fixed.
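
The cost of the sequence protocol can be illustrated with a small sketch. `CountingSequence` is a hypothetical stand-in for an array type, not a Numeric or numarray class, and its `native_min` method only stands in for a library reduction such as the proposed nxmin, which would loop in compiled code.

```python
class CountingSequence:
    """Hypothetical array stand-in that counts element accesses."""

    def __init__(self, values):
        self._values = list(values)
        self.getitem_calls = 0

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        # Every element handed to the builtin passes through here,
        # one Python-level call per element.
        self.getitem_calls += 1
        return self._values[index]

    def native_min(self):
        # Stand-in for an array library's own reduction: it never goes
        # through __getitem__ on this object.
        return min(self._values)


data = CountingSequence(range(1000))

slow = min(data)                        # builtin: walks the object item by item
calls_for_builtin = data.getitem_calls

fast = data.native_min()                # stand-in for nxmin
calls_after_native = data.getitem_calls
```

The builtin makes at least one `__getitem__` call per element, while the stand-in reduction adds none. For a real array type each of those calls also has to box an element into a Python object, which is where the slowdown comes from.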

So newbies and naive users are going to get screwed one way or the
other. If we retain MLab.min, people who expect the python min will
sometimes get a signature error. If we instead provide nxmin, they'll
inadvertently take a performance hit if they use python min naively.
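
The signature error mentioned here can be reproduced with a sketch. `mlab_style_min` is a hypothetical function that only mimics the `(m, axis=0)` call syntax on a list of rows. It is not the MLab implementation, and the real function may fail differently.

```python
import builtins


def mlab_style_min(m, axis=0):
    """Hypothetical stand-in with the matlab-like signature min(m, axis=0)."""
    if axis == 0:
        return [builtins.min(col) for col in zip(*m)]
    return [builtins.min(row) for row in m]


# A Python-style call with three scalars works with the builtin ...
python_answer = builtins.min(4, 2, 7)

# ... but the same call does not fit the matlab-like signature.
try:
    mlab_style_min(4, 2, 7)
    signature_error = None
except TypeError as exc:
    signature_error = exc
```

Here `signature_error` holds a `TypeError`, because the stand-in accepts at most two positional arguments.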

I'm weakly inclined to leave the situation as it is: it's compatible
with matlab which is essentially the pylab mission, and it's worked
for 10 or so years for Numeric's MLab. Cautious users and power users
have a clear alternative. If we leave it as is, we can easily provide
pymin, pymax, pyround, etc, for users who want the python version. I
am open to the proposal, but I think we should frame the argument as
one of performance versus convenience versus
least-likely-to-bite-your-ass versus matlab-compatibility rather than
fixing a bug.
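
A sketch of how the pymin, pymax and pyround aliases could be provided, assuming current Python 3, where the module is named `builtins` (in Python 2 it is `__builtin__`). The `min` defined below is a hypothetical matlab-style function on a list of rows, standing in for the one pylab exports.

```python
import builtins

# Keep the Python builtins reachable under py* names.
pymin = builtins.min
pymax = builtins.max
pyround = builtins.round


def min(m, axis=0):
    """Hypothetical matlab-style min that shadows the builtin in this scope."""
    if axis == 0:
        return [pymin(col) for col in zip(*m)]
    return [pymin(row) for row in m]


column_minima = min([[3, 1], [2, 5]])   # matlab-style call
scalar_minimum = pymin(3, 1, 2)         # Python-style call is still available
```

Because the aliases are bound from the `builtins` module itself, they keep working no matter what the bare name `min` refers to afterwards.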

JDH

_Fernando_Perez · January 10, 2005, 1:38am

John Hunter wrote:

I'm weakly inclined to leave the situation as it is: it's compatible
with matlab which is essentially the pylab mission, and it's worked
for 10 or so years for Numeric's MLab. Cautious users and power users
have a clear alternative. If we leave it as is, we can easily provide
pymin, pymax, pyround, etc, for users who want the python version. I
am open to the proposal, but I think we should frame the argument as
one of performance versus convenience versus
least-likely-to-bite-your-ass versus matlab-compatibility rather than
fixing a bug.

While pylab's mission is indeed matlab compatibility, you already point out that this is not an 'at all costs' objective. This is one case where I really think that breaking compatibility with the base python language is too high a price to pay. I'm having a hard time justifying my position in a clear manner, but I have a strong 'gut feeling' about it. I'll try to provide some rational points, though:

One of pylab's objectives is to help matlab users move over to python. While initially they will naturally only use the compatible functions, we hope they will grow into using all the things python offers which matlab lacks (nice OO, listcomps, generator expressions, the great generic standard library, etc.). This means that ultimately, we hope they will really use the python language to its fullest. At that point, if they begin using python code from 'the wild', they are very likely to be bitten by this kind of incompatibility (as we all have been).

The result: a decision made to ease the initial stages of a transition ends up causing an everlasting headache. And it's not like we can guarantee 100% source compatibility, since they are after all different languages. I think it's much better to add min, max & friends to the few things matlab users need to learn in the transition, rather than have everyone pay for this from now on.

You also need to keep in mind that pylab is likely to be used by _python users_ who have no matlab experience (I am such a person). For this group, the change of a builtin in this manner is very unexpected, and the source of all sorts of problems. As anecdotal evidence, it was precisely this particular problem with MLab which convinced me, a few years ago, to _never_ use 'from foo import *'. Even though I was not a matlab user, I thought the MLab names were nice and short, and for a while imported it wholesale. Until I wasted a lot of time tracking the min/max bug one day. When I found it, I felt like screaming at the MLab writers, and decided never again to trust a third party library with a * import. To this day, I use Numeric and Scipy always with qualified imports.

IMHO, MLab simply got this one wrong 10 years ago, and pylab should not repeat their mistake. In my own code, I have often written simple a* routines: amap, amin, amax, around, short for arraymap, arraymin, etc. I think it's short and clear, and provides a nice distinction of their functionality versus the builtins (it's quicker to type amin than nxmin, esp. on a qwerty keyboard where nx is an off-home-row chord).
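
A rough sketch of what such a* routines might look like. This is an assumption for illustration only: the routines described above operate on Numeric arrays and their exact signatures are not given in this thread, so the versions below work on nested lists with made-up signatures.

```python
import builtins


def _flatten(a):
    """Yield the scalar items of a nested list or tuple."""
    for item in a:
        if isinstance(item, (list, tuple)):
            yield from _flatten(item)
        else:
            yield item


def amap(func, a):
    """Apply func to every scalar item, keeping the nesting."""
    if isinstance(a, (list, tuple)):
        return [amap(func, item) for item in a]
    return func(a)


def amin(a):
    """Smallest scalar item anywhere in a."""
    return builtins.min(_flatten(a))


def amax(a):
    """Largest scalar item anywhere in a."""
    return builtins.max(_flatten(a))


def around(a, ndigits=0):
    """Round every scalar item to ndigits decimal places."""
    return amap(lambda x: builtins.round(x, ndigits), a)


grid = [[3, 1], [2, 5]]
smallest = amin(grid)
largest = amax(grid)
magnitudes = amap(abs, [[-1, 2], [3, -4]])
rounded = around([1.26, 2.74], 1)
still_builtin = min(3, 1)   # the builtin name is left alone
```

Since the array versions live under their own names, nothing in this module rebinds `min`, `max`, `round` or `map`.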

Anyway, this is as much as I'll say on the topic. It's ultimately your choice. But if I had my way, pylab would just provide a set of a*foo routines as array-based counterparts to the builtins, and it would document such a feature very prominently.

Cheers,

f

_Perry_Greenfield · January 10, 2005, 3:23am

Fernando Perez wrote:

While pylab's mission is indeed matlab compatibility, you already
point out that this is not an 'at all costs' objective. This is one
case where I really think that breaking compatibility with the base
python language is too high a price to pay. I'm having a hard time
justifying my position in a clear manner, but I have a strong 'gut
feeling' about it. I'll try to provide some rational points, though:

[...]

My 2 cents is that I think Fernando is right on this issue. I'd rather
go with a solution that causes temporary pain for matlab users rather
than one that causes lingering, long-term irritations.

Perry