overriding builtin variables in pylab

John_Hunter1 · January 10, 2005, 10:46am

"Perry" == Perry Greenfield <perry@...31...> writes:

    > My 2 cents is that I think Fernando is right on this
    > issue. I'd rather go with a solution that causes temporary
    > pain for matlab users rather than one that causes
    > lingering, long-term irritations.

OK, looks like a consensus to me

I'm happy with Fernando's proposed names amin, amax, around, etc. If
everyone else is too, I propose Andrew implement his patch, provide
the compatibility names, and update the relevant docs to advertise
this prominently: API_CHANGES, CHANGELOG, tutorial and users guide.
Particularly in the latter two, I think we should warn people about
the potential performance hit of using the builtin min, max and
friends on large arrays.

I'll put this on the new "News Flash" section of the web site with the
next release, which at least should get people's attention.

A-foolish-consistency-is-the-hobgoblin-of-a-small-mindly-yours,
JDH

Norbert_Nemec · January 10, 2005, 1:29pm

There might be a solution that avoids the performance hit: there should not be
any problem with pylab offering an optimized set of min, max, etc., as long as
their signatures are identical to the builtins and their behavior only extends
them. Something along the lines of:

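# Python 2 sketch; assumes Numeric supplies ArrayType and minimum.
import __builtin__
from Numeric import ArrayType, minimum

def min(*args, **kwargs):
    if args == ():
        raise TypeError("min() takes at least 1 argument (0 given)")
    if len(args) == 1 and type(args[0]) is ArrayType:
        # Array input: reduce along the axis given as a keyword (default 0).
        axis = kwargs.pop('axis', 0)
        res = minimum.reduce(args[0], axis)
    else:
        # Anything else: defer to the builtin.
        res = __builtin__.min(*args)
    if len(kwargs) > 0:
        raise TypeError(
            "min() got an unexpected keyword argument '%s'"
            % kwargs.keys()[0]
        )
    return res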

One could probably even avoid separate amin, amax, etc. functions. The user
just has to be aware that the axis can only be given as a keyword argument.
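For readers who want to try the idea on a current Python 3, here is a minimal sketch of the same dispatch. It is not pylab code: Array2D is a hypothetical stand-in for a Numeric array, included only so the example runs without any array package, and the wrapper simply forwards everything else (including keyword arguments the builtin itself understands) to builtins.min.

```python
import builtins


class Array2D:
    """Hypothetical stand-in for a Numeric array: a list of equal-length rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def min(self, axis=0):
        # axis=0 reduces down the columns, axis=1 reduces along each row.
        if axis == 0:
            return [builtins.min(col) for col in zip(*self.rows)]
        if axis == 1:
            return [builtins.min(row) for row in self.rows]
        raise ValueError("axis must be 0 or 1")


def min(*args, **kwargs):
    """Array-aware min(); 'axis' can only be given as a keyword."""
    if len(args) == 1 and isinstance(args[0], Array2D):
        axis = kwargs.pop("axis", 0)
        if kwargs:
            raise TypeError(
                "min() got an unexpected keyword argument %r" % next(iter(kwargs))
            )
        return args[0].min(axis)
    # Everything else goes to the builtin unchanged, so its own argument
    # checking and error messages are preserved.
    return builtins.min(*args, **kwargs)


a = Array2D([[3, 1], [2, 5]])
print(min(a))            # column-wise minimum
print(min(a, axis=1))    # row-wise minimum
print(min(4, 2, 9))      # plain builtin behavior
```

Delegating the non-array case wholesale, rather than re-implementing the builtin's error handling, keeps the wrapper from drifting out of step with the builtin's signature.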

On Monday, 10 January 2005 11:46, John Hunter wrote:

I'm happy with Fernando's proposed names amin, amax, around, etc. If
everyone else is too, I propose Andrew implement his patch, provide
the compatibility names, and update the relevant docs to advertise
this prominently: API_CHANGES, CHANGELOG, tutorial and users guide.
Particularly in the latter two, I think we should warn people about
the potential performance hit of using the builtin min, max and
friends on large arrays.

--
_________________________________________Norbert Nemec
         Bernhardstr. 2 ... D-93053 Regensburg
     Tel: 0941 - 2009638 ... Mobil: 0179 - 7475199
           eMail: <Norbert@...160...>

_Fernando_Perez · January 10, 2005, 7:49pm

Norbert Nemec wrote:

On Monday, 10 January 2005 11:46, John Hunter wrote:

I'm happy with Fernando's proposed names amin, amax, around, etc. If
everyone else is too, I propose Andrew implement his patch, provide
the compatibility names, and update the relevant docs to advertise
this prominently: API_CHANGES, CHANGELOG, tutorial and users guide.
Particularly in the latter two, I think we should warn people about
the potential performance hit of using the builtin min, max and
friends on large arrays.

There might be a solution that avoids the performance hit: there should not be any problem with pylab offering an optimized set of min, max, etc. as long as their signature is identical to the builtins and the behavior only extends them. Something along the line of:

Hmm. Those extra checks in your code don't come for free... I'd rather leave the builtins alone (many of them are C-coded, hence quite fast), and just provide array versions where needed.
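A rough way to see the cost of the extra checks is to time a thin wrapper against the builtin on a small input, where the fixed per-call overhead matters most. This is only a sketch for a current Python 3: wrapped_min and its hasattr test are hypothetical stand-ins for the proposed wrapper, and the numbers will vary by machine and interpreter.

```python
import builtins
import timeit


def wrapped_min(*args, **kwargs):
    # Hypothetical wrapper: one extra check before delegating to the builtin.
    if len(args) == 1 and hasattr(args[0], "shape"):
        raise NotImplementedError("array branch omitted in this sketch")
    return builtins.min(*args, **kwargs)


data = list(range(10))
n = 20000

# Time the same call through the builtin and through the wrapper.
t_builtin = timeit.timeit(lambda: builtins.min(data), number=n)
t_wrapped = timeit.timeit(lambda: wrapped_min(data), number=n)

print("builtin: %.4f s, wrapped: %.4f s" % (t_builtin, t_wrapped))
```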

Just my 1e-2

Best,

f

Andrew_Straw3 · January 10, 2005, 9:38pm

John Hunter wrote:

"Perry" == Perry Greenfield <perry@...31...> writes:

   > My 2 cents is that I think Fernando is right on this
   > issue. I'd rather go with a solution that causes temporary
   > pain for matlab users rather than one that causes
   > lingering, long-term irritations.

OK, looks like a consensus to me

I guess my opinion on this is already clear :).

I'm happy with Fernando's proposed names amin, amax, around, etc. If
everyone else is too, I propose Andrew implement his patch, provide
the compatibility names, and update the relevant docs to advertise
this prominently: API_CHANGES, CHANGELOG, tutorial and users guide.
Particularly in the latter two, I think we should warn people about
the potential performance hit of using the builtin min, max and
friends on large arrays.

I'm happy to implement whatever the consensus is, but I'm quite busy this week and away this weekend, so it'll be next week before I can do anything. If someone else wants to jump in and do it, I certainly won't mind.

Norbert Nemec wrote:

There might be a solution that avoids the performance hit: there should not be any problem with pylab offering an optimized set of min, max, etc. as long as their signature is identical to the builtins and the behavior only extends them. Something along the line of:

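# Python 2 sketch; assumes Numeric supplies ArrayType and minimum.
import __builtin__
from Numeric import ArrayType, minimum

def min(*args, **kwargs):
    if args == ():
        raise TypeError("min() takes at least 1 argument (0 given)")
    if len(args) == 1 and type(args[0]) is ArrayType:
        # Array input: reduce along the axis given as a keyword (default 0).
        axis = kwargs.pop('axis', 0)
        res = minimum.reduce(args[0], axis)
    else:
        # Anything else: defer to the builtin.
        res = __builtin__.min(*args)
    if len(kwargs) > 0:
        raise TypeError(
            "min() got an unexpected keyword argument '%s'"
            % kwargs.keys()[0]
        )
    return res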

What do people think about Norbert's "best of both worlds" approach? Although it seems great in theory, I'm disinclined to use it simply because it does override the builtin. He has doubtless constructed it with great care to behave exactly like the builtin, but I worry about obscure corner cases that won't behave exactly the same and may result in even more obscure bugs. Maybe my misgivings are undue paranoia, and his third way really is best. I'd want to throw lots of tests at it before pronouncing final judgement, which I don't have time to do at the moment. (Next week, if need be...)

Just a thought for consideration: perhaps Norbert's code could actually be used by the underlying mlab.py modules? I guess some code in the wild passes the axis argument positionally rather than as a keyword, so there would be a backwards-incompatible change in that regard... Other than that, though, this kind of behavior from the mlab.py modules would probably have resulted in a less serious conundrum than the one we now face.

Also, we must not forget about round, sum, and abs (and any others I have missed). For example, abs() caught me because I use the cgkit quaternion type, which overrides the __abs__ method and thus fails to work properly with the mlab.py implementation of abs().
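The abs() problem is easy to reproduce in miniature. In the sketch below, Quaternion is a hypothetical class (not cgkit's) and array_abs is a hypothetical stand-in for an array-only replacement of abs(); neither is taken from mlab.py. The builtin honours __abs__, while the array-oriented version assumes a sequence of numbers and fails.

```python
import math


class Quaternion:
    """Hypothetical quaternion type; only what the example needs."""

    def __init__(self, w, x, y, z):
        self.w, self.x, self.y, self.z = w, x, y, z

    def __abs__(self):
        # The builtin abs() dispatches to this method.
        return math.sqrt(self.w**2 + self.x**2 + self.y**2 + self.z**2)


def array_abs(a):
    """Stand-in for an array-only abs(): assumes 'a' is a sequence of numbers."""
    return [-v if v < 0 else v for v in a]


q = Quaternion(1.0, 2.0, 2.0, 4.0)
print(abs(q))  # the builtin calls Quaternion.__abs__

try:
    array_abs(q)  # what a caller gets once the name abs is shadowed
except TypeError as exc:
    print("array_abs failed:", exc)
```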

Cheers!
Andrew


Norbert_Nemec · January 11, 2005, 9:00am

OK, I understand that the overhead added by my routine is a bit much. Maybe
we could go halfway: override min and max for arrays, but leave the axis
argument to the more specialized amin and amax routines?

Also, we must not forget about round, sum, and abs (and any others I have
missed). For example, abs() caught me because I use the cgkit quaternion
type, which overrides the __abs__ method and thus fails to work properly
with the mlab.py implementation of abs().

Let's look at these one by one:
* abs() already calls an __abs__() method. The clean way to extend it would
therefore be to give arrays such a method. This should solve the problem
completely.

* round() does not seem to be extensible in Python. Maybe we should propose
changing Python itself to introduce a __round__ method? That would be the
straightforward fix.

* min, max and sum are all based on iterating over a sequence. Maybe there
should likewise be __min__, __max__ and __sum__ methods, which the builtin
would check before falling back to iterating over the sequence? I could
imagine many kinds of containers that could optimize these operations. So
this would again be a detail that should be changed in Python itself. If the
builtin min() function then also passed on keyword arguments, that would
solve our problem completely and thoroughly.
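As a footnote for readers on a current interpreter: Python 3 did later gain a __round__ hook that the builtin round() calls, alongside the existing __abs__, whereas min(), max() and sum() still have no hook of their own and simply iterate. A minimal sketch of what that looks like, using a hypothetical Samples container that is not part of any library:

```python
class Samples:
    """Hypothetical container used only for illustration."""

    def __init__(self, values):
        self.values = list(values)

    def __abs__(self):
        # Called by the builtin abs().
        return Samples(abs(v) for v in self.values)

    def __round__(self, ndigits=None):
        # Called by Python 3's builtin round(); no such hook existed in 2005.
        return Samples(round(v, ndigits) for v in self.values)

    def __iter__(self):
        # min(), max() and sum() have no dedicated hook; they iterate.
        return iter(self.values)


s = Samples([-1.4, 2.6, 0.25])
print(round(abs(s)).values)
print(min(s), max(s))
```

An element-by-element __iter__ is exactly the slow path the thread is worried about for large arrays, which is why the min/max/sum part of the proposal would still need a change to Python itself.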

Does anybody have experience with discussions in the Python forum to estimate
how realistic such a PEP would be?

Ciao,
Norbert

On Monday, 10 January 2005 22:38, Andrew Straw wrote:
