numpy dtype argument

When looking at, e.g., axes.py, I see three different dtype arguments passed to numpy astype()/array()/zeros() and friends:

   x = np.asarray(x).astype(np.float32)
   x = np.zeros( x, np.float_ )
   x = np.ones((col,), float)

Is there a preferred one to stick to?

Manuel

2008/6/12 Manuel Metz <mmetz@...459...>:

When looking, e.g. at axes.py, I see 3 different arguments passed to
numpy astype()/array()/zero() and friends:

  x = np.asarray(x).astype(np.float32)
  x = np.zeros( x, np.float_ )
  x = np.ones((col,), float)

Is there a preferred one to stick to ?!

Both `np.float_` and the Python `float` are interpreted as
`np.float64`. The only time you really need something other than
`float` is if you require a width other than 64 (like in the first
line you showed).
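
A quick way to check this equivalence, as a minimal sketch: it uses only the builtin `float` and `np.float64`, since those spellings work across NumPy versions (the array names `a`, `b`, `c` are just for illustration).

```python
import numpy as np

# The builtin float and np.float64 resolve to the same dtype.
assert np.dtype(float) == np.dtype(np.float64)

a = np.ones((3,), float)                      # builtin float as the dtype argument
b = np.zeros(3, np.float64)                   # explicit 64-bit NumPy type
c = np.asarray([1, 2, 3]).astype(np.float32)  # explicit 32-bit width

print(a.dtype, a.itemsize)  # float64 8
print(b.dtype, b.itemsize)  # float64 8
print(c.dtype, c.itemsize)  # float32 4
```

Only the third array differs, because a non-default width was requested explicitly.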

Regards
Stéfan

This suggests that the first line may be a buglet (without any real consequence), since there appears to be no good reason to require that array to be single precision. I think it's reasonable to say that we should use double precision (float/float_/float64) everywhere floating point is needed, since a) that's what Python floats are, b) it's on the safe side precision-wise, and c) that's what most (if not all) of the C++ extensions, such as _backend_agg.cpp, expect. Exceptions would be when some binary interface, file format, etc. requires otherwise.

Note, the first line also has a moderate performance penalty since the array might be converted twice (once for asarray, and once for astype). It would be better to do:

  x = np.asarray(x, np.float_)
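
To illustrate the difference, here is a small sketch comparing the two spellings. It passes the builtin `float` instead of `np.float_`, because that alias is not available in NumPy 2.x; the variable names are hypothetical.

```python
import numpy as np

data = [1, 2, 3]

# One step: build the array directly with the target dtype.
direct = np.asarray(data, dtype=float)

# Two steps: build an integer array first, then convert it.
two_step = np.asarray(data).astype(float)
assert direct.dtype == two_step.dtype == np.float64

# When the input already has the requested dtype, asarray returns it
# unchanged, while astype makes a copy by default.
already = np.arange(3, dtype=float)
same = np.asarray(already, dtype=float)
copied = already.astype(float)

print(same is already)                     # True
print(np.shares_memory(copied, already))   # False
```

So the single `asarray` call avoids both the intermediate array and the unconditional copy.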

As for the distinction between the last two, I think it is primarily a style issue. I don't have a strong feeling either way, but consistency would be an improvement, I suppose.

Cheers,
Mike

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

One rule to remember: don't cast user input to float unless there is
no alternative. Consider that the input might be dates::

  In [9]: import numpy as np

  In [10]: import datetime

  In [11]: x = [datetime.date(2007,1,1), datetime.date(2008,1,1),
                datetime.date(2009,1,1)]

  In [12]: xa = np.asarray(x)

  In [13]: print xa[1:]-xa[:-1]
  [365 days, 0:00:00 366 days, 0:00:00]

So many operations may be valid without an explicit cast to floats. Since
we are supporting custom user types, albeit imperfectly, this is
something to bear in mind. In particular, I find the native datetime
plotting invaluable, and the JPL uses custom unit types extensively.
All of this happens via the unit converter infrastructure in
matplotlib.units, and there is machinery there to do the cast to
floats. Last time this came up, Eric, I think, requested a document
specifying what operations can be assumed on input sequences. I
haven't done that yet, but I am mentioning this again here as a
reminder. When I have some time I'll try to get a document together,
as well as some test code to test custom input types.
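
As a rough sketch of why an early cast is harmful, the snippet below repeats the date example in script form and then tries to force the same input to float. It uses only numpy and the standard library and does not touch matplotlib.units; the names `deltas` and `cast_failed` are illustrative.

```python
import datetime
import numpy as np

x = [datetime.date(2007, 1, 1),
     datetime.date(2008, 1, 1),
     datetime.date(2009, 1, 1)]

# Without a dtype, the dates are kept as Python objects.
xa = np.asarray(x)
assert xa.dtype == object

# Elementwise subtraction still works and yields timedelta objects.
deltas = xa[1:] - xa[:-1]
print([d.days for d in deltas])  # [365, 366]

# Forcing a float dtype on the same input fails outright.
try:
    np.asarray(x, dtype=float)
    cast_failed = False
except (TypeError, ValueError):
    cast_failed = True
print(cast_failed)  # True
```

Leaving the input uncast until a converter has had a chance to handle it keeps this kind of data usable.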

JDH

Manuel Metz wrote:

   x = np.asarray(x).astype(np.float32)
   x = np.zeros( x, np.float_ )
   x = np.ones((col,), float)

Is there a preferred one to stick to ?!

Michael Droettboom wrote:

  x = np.asarray(x, np.float_)

I'd vote for:

  x = np.asarray(x, np.float)

It ends up resulting in the same thing, but I prefer not to have magic-looking underscores in names. There is a slight difference:

np.float is the Python float type, whereas np.float_ is the numpy float64 type. I suppose in some future 128-bit OS, the Python float may be something else. I like saying "make this the standard Python float type" rather than "make this a 64-bit float", even though it means the same thing in all versions of Python I know of.

This, of course, only when casting really is required, as John H. pointed out.
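
The relationship between the two types can be inspected directly. This is only a sketch: it uses the builtin `float` and `np.float64`, because the `np.float` and `np.float_` aliases discussed here are not available in recent NumPy releases (`np.float` was removed in 1.24 and `np.float_` in 2.0).

```python
import sys
import numpy as np

# np.float64 is a subclass of the builtin float, so its scalars
# pass isinstance checks against float.
print(issubclass(np.float64, float))        # True
print(isinstance(np.float64(1.5), float))   # True

# Width of the dtype NumPy picks for the builtin float.
bits = np.finfo(float).bits
print(bits)  # 64

# Mantissa digits of the interpreter's own float, for comparison
# with the float64 mantissa reported by NumPy.
same_mantissa = sys.float_info.mant_dig == np.finfo(np.float64).nmant + 1
```

Passing `float` states the intent "standard Python float", while `np.float64` states the width explicitly.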

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...