mpl.math namespace [was: Polygon examples broken]

> I used the following list:
>
> symlist=`cat <<EOF
> pi inf Inf nan NaN
> isfinite isnormal isnan isinf
> arccos arcsin arctan arctan2 cos sin tan
> arccosh arcsinh arctanh cosh sinh tanh
> exp log log10 expm1 log1p exp2 log2
> pow sqrt cbrt erf erfc lgamma tgamma hypot
> fmod remainder remquo
> fabs fdim fmax fmin
> copysign signbit frexp ldexp logb modf scalbn
> ceil floor rint nexttoward nearbyingt round trunc
> conj cproj abs arg imag real
> min max minimum maximum
> EOF`
>
> This measure doesn't distinguish between comments and
> code, but it should still be good enough for the purposes

As far as namespaces are concerned, I agree they are a good idea and
should be used in almost all places. I also don't want the perfect to
be the enemy of the good, or succumb to a foolish consistency, so I
think it is OK to have some very common names that have a clear
meaning to be used w/o a namespace. I have been following your
discussion at a bit of a distance: are you talking about using scalar
functions or array functions here, eg math.sqrt vs numpy.sqrt? Also,

Since numpy.* handles scalars but math.* doesn't handle vectors, I
suggest importing from numpy.
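
A quick illustration of that asymmetry, as a minimal sketch (the sample values are arbitrary and chosen only for the demonstration):

```python
import math

import numpy

values = numpy.array([1.0, 4.0, 9.0])

# numpy.sqrt accepts both a scalar and an array.
scalar_root = numpy.sqrt(4.0)
array_root = numpy.sqrt(values)

# math.sqrt needs something convertible to a single float,
# so a multi-element array is rejected with a TypeError.
try:
    math.sqrt(values)
except TypeError:
    math_handles_arrays = False
else:
    math_handles_arrays = True

print(scalar_root, array_root, math_handles_arrays)
```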

a few of your symbols clash with python builtins (min, max, abs) which
is best avoided.

Feel free to tune the list appropriately. Particularly since max/min/abs
already do the right things with vectors:

>>> import numpy
>>> a = numpy.array([1,2,3,4])
>>> b = numpy.array([4,3,-2,-1])
>>> abs(b)
array([4, 3, 2, 1])
>>> isinstance(abs(b),numpy.ndarray)
True
>>> min(a)
1
>>> min(b)
-2

Well, mostly 8-)

>>> min(a,b)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
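
For the two-array case, numpy provides element-wise counterparts. A minimal sketch using the same a and b as in the session above:

```python
import numpy

a = numpy.array([1, 2, 3, 4])
b = numpy.array([4, 3, -2, -1])

# Element-wise comparison of two arrays, which the builtin min(a, b)
# cannot do because it tries to evaluate "b < a" as a single truth value.
lower = numpy.minimum(a, b)
upper = numpy.maximum(a, b)

# Reduction over a single array; in 1D this matches the builtin min(a).
smallest = a.min()

print(lower, upper, smallest)
```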

Finally, how would you feel about allowing these
symbols in the module namespace, but w/o the import * semantics, eg,
for these symbols we do

from mpl.math import exp, sin, pi, sin, cos, ...

it does defeat the purpose of your idea a bit, which is to have a set
of commonly agreed on math symbols that everyone agrees on and we can
always rely on with easy convenience. On the other hand, I am more
comfortable being explicit here.

That's acceptable.

If the list of common items were shorter it would be easier. Now
whenever I use an expression I have to search the file for the math
import statement and check whether the particular symbol I need has
already been added to the list. For my own projects I started making
a standard import line I could cut and paste into every file, but it
came to more than 80 characters.

- Paul

···

On Sat, Jul 21, 2007 at 09:42:16AM -0500, John Hunter wrote:

On 7/21/07, Paul Kienzle <pkienzle@...537...> wrote:

Hell, not with anything that is more than 1D! Beware.
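
A small sketch of what goes wrong in 2D (the array contents are made up for the example). The builtin iterates over rows and compares whole rows, while the array method reduces over elements:

```python
import numpy

grid = numpy.array([[1, -2], [3, 4]])

# The builtin min() compares rows with "<", which produces a boolean
# array whose truth value is ambiguous, so it raises ValueError.
try:
    min(grid)
except ValueError:
    builtin_min_works = False
else:
    builtin_min_works = True

overall = grid.min()            # reduce over every element
per_column = grid.min(axis=0)   # reduce along the first axis

print(builtin_min_works, overall, per_column)
```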

Gaël

···

On Sat, Jul 21, 2007 at 11:34:07AM -0400, Paul Kienzle wrote:

> Feel free to tune the list appropriately. Particularly since max/min/abs
> already do the right things with vectors:
>
> True
> >>> min(a)
> 1
> >>> min(b)

We should prefer the numpyisms anyway, a.max(), a.min(), a.absolute().

> Finally, how would you feel about allowing these
> symbols in the module namespace, but w/o the import * semantics, eg,
> for these symbols we do
>
> from mpl.math import exp, sin, pi, sin, cos, ...
>
> it does defeat the purpose of your idea a bit, which is to have a set
> of commonly agreed on math symbols that everyone agrees on and we can
> always rely on with easy convenience. On the other hand, I am more
> comfortable being explicit here.

> That's acceptable.
>
> If the list of common items were shorter it would be easier. Now
> whenever I use an expression I have to search the file for the math
> import statement and check whether the particular symbol I need has
> already been added to the list. For my own projects I started making
> a standard import line I could cut and paste into every file, but it
> came to more than 80 characters.

I have no problem with multi-line imports, but Eric hates them.
Apparently future Python versions will support parentheses in imports
to solve the multiline problem.
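
For reference, the parenthesized form looks like the sketch below. To the best of my knowledge it was specified in PEP 328 and is available from Python 2.4 onward; the particular symbol list here is only an example:

```python
# Parenthesized form: the list can span several lines without backslashes.
from numpy import (
    pi, sqrt, exp, log,
    sin, cos, tan,
)

res = sqrt(sin(pi / 2) + cos(0.0))
print(res)
```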

I'm starting to come to the conclusion that the only reason to go with
a module with a core set of math/numpy names is if we want to use the
import * semantics, right? Otherwise, we might as well do

from numpy import the, core, list, of, math, symbols

rather than

from matplotlib.math import the, core, list, of, math, symbols

which only confuses the reader (are these symbols from numpy, math or
internal to mpl?)

So if we don't want to do that every time, and do want the convenience
of having a core set of names available in pretty much every mpl
module, we may as well have a "corenamespace" module that utilizes
__all__ . Basically, we would be making a statement that we are
flying w/o a net here but we know what we are doing. This would be
documented in the CODING_GUIDE in the import section. The reason I
bring up the idea of a core namespace is there are other functions
that belong there alongside asarray and pi (iterable and
is_string_like come to mind).
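
To make the __all__ idea concrete, here is a rough sketch. The module name "corenamespace" comes from the paragraph above, but the symbol list and the private helper are assumptions for illustration only, and the module is built in memory so the example is self-contained:

```python
import sys
import types

import numpy

# Hypothetical stand-in for a "corenamespace" module; in practice this
# would be an ordinary .py file inside the package.
core = types.ModuleType("corenamespace")
core.pi = numpy.pi
core.sqrt = numpy.sqrt
core.sin = numpy.sin
core.asarray = numpy.asarray
core.helper = lambda: None   # hypothetical name, deliberately not exported
core.__all__ = ["pi", "sqrt", "sin", "asarray"]
sys.modules["corenamespace"] = core

# "import *" copies only the names listed in __all__.
namespace = {}
exec("from corenamespace import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

The point of the explicit __all__ is that the set of names injected by the star import is fixed and documented in one place, rather than being whatever the module happens to define.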

I'm not advocating this approach -- I'm just pointing out that I see
no reason to have a separate module of numpy names if we are going to
be explicit about the imports from there. In that case, we may as
well explicitly import from numpy.

JDH

···

On 7/21/07, Paul Kienzle <pkienzle@...537...> wrote:

John Hunter wrote:

We should prefer the numpyisms anyway, a.max(), a.min(), a.absolute().

Exactly -- OO semantics make the whole namespace thing much more workable.
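
A short sketch of the method-style calls. One caveat, as far as I know: ndarray provides max() and min() methods but no absolute() method, so the magnitude has to come from numpy.absolute or the builtin abs():

```python
import numpy

b = numpy.array([4, 3, -2, -1])

largest = b.max()
smallest = b.min()

# No absolute() method on ndarray, as far as I know; use the function.
has_absolute_method = hasattr(b, "absolute")
magnitudes = numpy.absolute(b)

print(largest, smallest, has_absolute_method, magnitudes)
```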

To sum up a bit -- This all started with a comment about how some of the pylab names clash with numpy names, so that:

from pylab import *
from numpy import *

doesn't quite work.

I waded in with my usual obsessive advocacy for namespaces, and then it was pointed out that in math expressions, all those namespace prefixes made things look pretty ugly. I conceded that this is the case, and that in the interest of "practicality beats purity", having a standard set of math symbols in a module to be import *'ed might be a good idea.

Now to my personal advocacy about this idea:

I think that convenience overriding the clarity of namespaces ONLY applies to math expressions, like the example given:

> res = sqrt(2*sin(pi*x**2) + cos(x**2) - exp(2*pi*1j))

really is easier to deal with than:

  res = npy.sqrt(2*npy.sin(npy.pi*x**2) + npy.cos(x**2) - npy.exp(2*npy.pi*1j))

But it's not about saving typing -- it's about having math look like math. So the measure of whether a given name should be in the "math" namespace should be whether it's a "math" function, vs. a programming function.
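
Both spellings compute the same thing, so the choice is purely about readability. A runnable version of the comparison, assuming npy is an alias for numpy and using an arbitrary sample x:

```python
import numpy as npy
from numpy import sqrt, sin, cos, exp, pi

x = npy.linspace(0.0, 1.0, 5)

# Bare names, imported explicitly.
bare = sqrt(2*sin(pi*x**2) + cos(x**2) - exp(2*pi*1j))

# Fully prefixed names.
prefixed = npy.sqrt(2*npy.sin(npy.pi*x**2) + npy.cos(x**2) - npy.exp(2*npy.pi*1j))

same = npy.allclose(bare, prefixed)
print(same)
```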

Now some data:

Paul Kienzle wrote:

I'll let the code speak for itself:

~/src/matplotlib/lib/matplotlib pkienzle$ for sym in $symlist; do
    echo `grep "[^A-Za-z0-9_]$sym[^A-Za-z0-9_]" *.py | wc -l` $sym; done | sort -n -r | column -c 75

163 max       7 remainder   1 cosh        0 isnormal
136 arg       7 pow         1 arctanh     0 isinf
109 min       7 inf         1 arcsinh     0 isfinite
102 log       6 arctan2     1 arccosh     0 frexp
 64 pi        5 fabs        0 trunc       0 fmin
 56 sqrt      4 imag        0 tgamma      0 fmax
 44 abs       3 tan         0 signbit     0 fdim
 38 sin       3 nan         0 scalbn      0 expm1
 28 cos       3 log2        0 rint        0 exp2
 23 minimum   3 hypot       0 remquo      0 erfc
 22 round     2 isnan       0 nexttoward  0 erf
 19 maximum   2 arctan      0 nearbyingt  0 cproj
 19 floor     2 arcsin      0 modf        0 copysign
 18 log10     2 arccos      0 logb        0 conj
 18 ceil      1 tanh        0 log1p       0 cbrt
 13 real      1 sinh        0 lgamma      0 NaN
 12 exp       1 fmod        0 ldexp       0 Inf
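
Since the grep count does not distinguish comments from code, one possible refinement is to count only name tokens. This is a sketch of a different technique (Python's tokenize module instead of grep), not the script used for the table above; count_symbols and the sample source are hypothetical:

```python
import io
import tokenize
from collections import Counter


def count_symbols(source, symbols):
    """Count NAME tokens in `source` that match one of `symbols`."""
    wanted = set(symbols)
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Comments and string literals have their own token types,
        # so they are skipped here.
        if tok.type == tokenize.NAME and tok.string in wanted:
            counts[tok.string] += 1
    return counts


sample = (
    "# max of the log values\n"
    "y = log(x) + log(n)\n"
    "top = max(y)\n"
)
counts = count_symbols(sample, ["log", "max", "sqrt"])
print(counts.most_common())
```

Note that this still counts attribute and keyword-argument names (for example the max in a.max()), so it overestimates free-function use in the same way grep does for those cases.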

Now we're looking at the same data -- but what conclusions do we draw?

I think the top three (max, arg, min) don't pass the test of being "math functions", so we have log, pi, sqrt, abs, sin, cos ... that see quite a bit of use, so maybe it's worth it. But in all of the MPL code, even 102 uses isn't that much -- and how many of those are buried in larger expressions? i.e.:

val = log(x)

isn't really that much better than:

val = npy.log(x)

It only really pays off in something contrived like:

z = sin(log(x) * exp(y) + log(n))**(1/log(t))

Now consider:

Eric Firing wrote:

For many of these things there are up to 5 different possible sources:

(builtin, if not math or cmath)
math
cmath
numpy
numpy.ma
maskedarray

I'd argue that for MPL, math and cmath are rarely needed, and we hope that soon there will only be one of numpy.ma and maskedarray, but nevertheless, it is an issue.
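
To show why the source matters, here is a small sketch comparing how four of those sources treat the square root of a negative number (sqrt is just a convenient example; the input values are arbitrary):

```python
import cmath
import math

import numpy
import numpy.ma

# math: real scalars only, raises on a negative argument.
try:
    math.sqrt(-1.0)
except ValueError:
    math_result = "ValueError"

# cmath: complex scalars.
cmath_result = cmath.sqrt(-1.0)

# numpy: arrays; a negative real gives nan (normally with a RuntimeWarning,
# silenced here).
with numpy.errstate(invalid="ignore"):
    numpy_result = numpy.sqrt(numpy.array([4.0, -1.0]))

# numpy.ma: entries outside the function's domain are masked.
masked_result = numpy.ma.sqrt(numpy.ma.array([4.0, -1.0]))

print(math_result, cmath_result, numpy_result, masked_result)
```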

So my conclusion is that it's not worth it, but reasonable people may disagree, of course!

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception