Date overhaul?

Recently I took a crack at fixing some of the bugs in dates.py, and it
seems like there's been some talk of overhauling how dates are handled.
I don't see an MEP for that, so I'm wondering if anyone can give me some
more details about what the impetus was for overhauling date handling
and just in general what needs to be done. I wouldn't mind taking a
crack at the date handling stuff while it's still fresh in my mind.

Paul,

I think the main thing is supporting, and taking advantage of, the numpy datetime64 dtype. One thing that has held this up is that datetime64 came into numpy half-baked, and has remained experimental with known problems that need to be fixed. It looks like the core of datetime64, ignoring timezone problems, isn't going to change, so it should be possible to work with that in matplotlib.

Eric

···

On 2015/01/07 11:48 AM, Paul Ganssle wrote:

Recently I took a crack at fixing some of the bugs in dates.py, and it
seems like there's been some talk of overhauling how dates are handled.
I don't see an MEP for that, so I'm wondering if anyone can give me some
more details about what the impetus was for overhauling date handling
and just in general what needs to be done. I wouldn't mind taking a
crack at the date handling stuff while it's still fresh in my mind.

You can do some googling, but the issue with timezones in datetime64 is
that it _always_ uses the system timezone to translate when parsing ISO
strings (and bare datetime.datetime objects) without a timezone, and I'm
pretty sure it does something like that when formatting string output, too.

It can be worked around if you are careful to always make it think you are
working in UTC.
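
A minimal sketch of that workaround, assuming Python 3 and a numpy that accepts a naive datetime.datetime in the np.datetime64 constructor: convert every value to UTC and drop the tzinfo before numpy sees it, so nothing is left for numpy to translate. The helper name to_utc_datetime64 is made up for illustration, and treating naive inputs as already being in UTC is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

import numpy as np


def to_utc_datetime64(dt):
    """Return dt as np.datetime64, normalised to UTC (hypothetical helper).

    Aware datetimes are shifted to UTC and stripped of tzinfo.
    Naive datetimes are assumed to be in UTC already.
    """
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return np.datetime64(dt)


# An aware timestamp at a fixed UTC-8 offset, used only as sample input.
utc_minus_8 = timezone(timedelta(hours=-8))
aware = datetime(2015, 1, 7, 11, 48, tzinfo=utc_minus_8)
stamp = to_utc_datetime64(aware)
```

Doing the conversion in one place keeps the "always UTC" rule from depending on every caller remembering it.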

This should change in a release or two (and I'm sorry to say that I've held
that up by stalling on getting proposals properly written up), but Eric's
right, the internals should stay close enough that it's worth using.

-Chris

···

On Wed, Jan 7, 2015 at 2:10 PM, Eric Firing <efiring@...229...> wrote:

  One thing that has held this up is that datetime64
came into numpy half-baked, and has remained experimental with known
problems that need to be fixed. It looks like the core of datetime64,
ignoring timezone problems, isn't going to change, so it should be
possible to work with that in matplotlib.

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

I'm real naive about this stuff, but I have always wondered why
matplotlib didn't just use datetime objects, or at least use
timezone-aware datetime objects as an "interchange" format to get the
timezone stuff right.

Skip

Time zone handling is a pain in the %}€*

And the definitions keep changing.

So you need a complex DB and library that needs frequent updating.

This is why neither the standard library nor numpy support time zone
handling out of the box.

But the datetime object does support a hook to add timezone info.
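
For reference, that hook is the tzinfo abstract base class in the standard library. The sketch below is only an illustration: FixedOffset is a made-up class with a constant offset and no DST rules, which is exactly the part a real time zone database has to supply.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical zone with a constant UTC offset and no DST."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No daylight saving adjustment in this toy zone.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


utc = FixedOffset(0, "UTC")
pst = FixedOffset(-8, "PST")

# Attach the zone through the tzinfo argument, then convert to UTC.
local_noon = datetime(2015, 1, 8, 12, 0, tzinfo=pst)
as_utc = local_noon.astimezone(utc)
```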

The numpy datetime64 implementation _may_ provide a similar hook in
the future.

There is the pytz package, which MPL could choose to depend on.

But even that is a bit ugly--e.g. from the pytz docs:

"""Unfortunately using the tzinfo argument of the standard datetime
constructors "does not work" with pytz for many timezones."""

So my suggestion is that MPL punt, and stick with leaving time zone
handling up to the user, i.e. only use datetimes that are timezone "naive".
What this means is that MPL would always assume all datetimes interacting
with each other are in the same time zone (including same DST status).

Anyway, I'm being a bit lazy here, so I may be wrong, but I think the issue
at hand is that MPL currently uses a float array to store and manipulate
datetimes, and the thought is that it may be better to use numpy datetime64
arrays -- that would give us more consistent precision, and less code to
convert to/from various datetime formats.

I'm a bit on the fence about whether it's time to do it, as datetime64 does
have issues with the locale timezone, but as any implementation would want
to work with not-just-the-latest numpy anyway, it may make sense to start
now.
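
To put a rough number on the precision point: assuming the float is a count of days starting from year 1 (what datetime.toordinal counts; whether that matches MPL's exact convention is an assumption here), the gap between adjacent float64 values near a 2015 date comes out at roughly ten microseconds, and it grows as dates get further from the origin. A datetime64 array with a fixed unit has the same resolution everywhere. The helper name below is hypothetical, and math.ulp needs Python 3.9 or later.

```python
import math
from datetime import datetime


def float_day_resolution_us(dt):
    """Spacing, in microseconds, between adjacent float64 day counts at dt.

    Assumes time is stored as fractional days since year 1.
    """
    midnight = datetime(dt.year, dt.month, dt.day)
    days = dt.toordinal() + (dt - midnight).total_seconds() / 86400.0
    # math.ulp gives the gap to the next representable float.
    return math.ulp(days) * 86400.0 * 1e6


resolution = float_day_resolution_us(datetime(2015, 1, 8, 7, 4))
```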

-Chris

···

On Thu, Jan 8, 2015 at 7:04 AM, Skip Montanaro <skip@...503...> wrote:

I'm real naive about this stuff, but I have always wondered why
matplotlib didn't just use datetime objects, or at least use
timezone-aware datetime objects as an "interchange" format to get the
timezone stuff right.

I agree w/ the original poster that it would help to have an MEP to clearly define what the goals of the overhaul are.

Something else to keep in mind: we at least don’t normally plot dates in “earth” based time systems. ~10 years ago we contracted with John Hunter to add the arbitrary unit system to MPL. This allows users to plot in their own data types and define a converter to handle the conversion to MPL types and labeling. We have our own “date time” like class which handles relativistic time scales (TDB, TT, TAI, GPS, Mars time, etc) at extremely high precision. We register a unit converter w/ MPL which allows our users to plot these types natively and use the xunits keyword (or yunits) to control how the plot looks. So we can do this:

plot( x, y, xunits="GPS", yunits="km/s" )

plot( x, y, xunits="PST", yunits="mph" )

It would also be pretty easy to add a time zone aware unit converter with the existing MPL code which would allow you to do things w/ datetime like this:

plot( x, y, xunits="UTC+8" )

plot( x, y, xunits="EST" )
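
For anyone who hasn't used the unit system, a converter is registered roughly as in the sketch below, which uses matplotlib.units.ConversionInterface and the units registry. Everything else is invented for illustration: the GpsSeconds class is a stand-in for a real time type, and the "s" and "h" unit names are arbitrary. This is not the high-precision class or converter described above.

```python
import matplotlib.units as munits


class GpsSeconds:
    """Hypothetical time type: seconds on some GPS-like scale."""

    def __init__(self, seconds):
        self.seconds = float(seconds)


class GpsConverter(munits.ConversionInterface):
    """Turns GpsSeconds values into plain floats for plotting."""

    # Hypothetical unit names mapped to seconds per unit.
    _scale = {"s": 1.0, "h": 3600.0}

    @staticmethod
    def convert(value, unit, axis):
        scale = GpsConverter._scale[unit or "s"]
        if isinstance(value, GpsSeconds):
            return value.seconds / scale
        return [v.seconds / scale for v in value]

    @staticmethod
    def axisinfo(unit, axis):
        # Only the axis label is set; locators and formatters stay default.
        return munits.AxisInfo(label="GPS time [%s]" % unit)

    @staticmethod
    def default_units(x, axis):
        return "s"


# Registering the converter makes MPL use it for GpsSeconds data.
munits.registry[GpsSeconds] = GpsConverter()
```

Once a type is in the registry, MPL picks the converter by the type of the data handed to it, so the plotting code never has to convert by hand.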

I guess the point of this is to remind folks that not everyone plots dates in time zone based systems…

Ted
