date proposal / dropping python2.2 support

First, a general note about python2.2. It is becoming difficult to
maintain adequate support for python2.2. The pyparsing module, on
which mathtext relies, currently requires 2.3. Handling dates
properly requires the datetime module (or mx.datetime but I'm not
inclined to impose an external dependency). I am inclined to gently
drop support for 2.2. By gently, I mean that some features will no
longer work (mathtext and dates) but the core should, at least for the
near future. How many people would this adversely affect?

The dates module, aside from a bug that I patched yesterday in
response to a post by Jim Boyle, has two fundamental problems: it has
no timezone support, and the date range supported by the built-in time
functions (which count from the 1970 epoch) is too narrow. Both of
these limitations are imposed by trying to support python2.2.

I would like to rewrite the dates module, and the ticker functions for
dates, to use the python datetime module. Getting dates, timezones,
and daylight savings time right is non-trivial, and I think the
cleanest approach is to require python2.3 and datetime. I.e., I would
jettison support for epoch times and mx datetimes, as well as the
converter stuff.

The new plot_date signature would be

    def plot_date(self, d, y, fmt='bo', **kwargs):

where d would be an array of floats (no converter) and the floats
would be the number of days since 1,1,1 (Gregorian calendar). The
supported date range would be datetime.min to datetime.max (years 0001
- 9999).

The dates module would provide some helper functions that you could
use to build date arrays from datetime and timedelta instances. It
would not be too hard to add some helper functions to convert existing
epoch, mx, or datetime arrays to the required array of float days.

Timezones, including timezones other than the local one, would be
supported. I.e., if you are a financial guru in California, you could
work with Eastern time zone stock quotes or Central time zone pork
belly quotes. Daylight saving time, etc., would be handled by the
datetime module.
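
As a rough illustration of the multi-timezone idea, here is a minimal
sketch written for present-day Python 3 (an assumption; it is not the
prototype code further down). It uses fixed-offset datetime.timezone
objects as stand-ins for real Eastern and Pacific zones, so it ignores
daylight saving time. The names EASTERN and PACIFIC and the sample
quote time are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins (standard time only, no DST rules); hypothetical names.
EASTERN = timezone(timedelta(hours=-5), "EST")
PACIFIC = timezone(timedelta(hours=-8), "PST")

# A made-up quote timestamp recorded in Eastern time.
quote_time = datetime(2004, 1, 15, 9, 30, tzinfo=EASTERN)

# The same instant as seen from California.
local_time = quote_time.astimezone(PACIFIC)

# Aware datetimes compare by instant, not by wall-clock fields.
same_instant = (quote_time == local_time)
```

A real implementation would need tzinfo classes that carry the DST
transition rules; the fixed offsets here only show the conversion step.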

The datetime module has the functions toordinal and fromordinal to
convert to and from an integer number of days since the start of the
Gregorian calendar, but not floating point. I.e., hours, minutes,
seconds, etc. are lost. My guess is that it is done this way to avoid
imprecision in floating point, but I am not sure. I have implemented
to_ordinalf and from_ordinalf to do these conversions while preserving
the hours, etc. They seem to work. I occasionally get rounding error on
the order of a couple of microseconds, which I think should be
tolerable for the vast majority of cases. If you need microsecond
precision, you can use plot and not plot_date in any case.
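
The size of that rounding error looks consistent with the resolution of
a double at these magnitudes. The following sketch (Python 3 assumed;
spacing_in_microseconds is a hypothetical helper, not part of the
proposal) estimates the gap between adjacent representable floats near
a given day number. Round-off in a single conversion should be at most
about half that gap.

```python
import math

SECONDS_PER_DAY = 86400.0

def spacing_in_microseconds(ordinal_days):
    """Gap between adjacent doubles near ordinal_days, in microseconds."""
    # frexp writes ordinal_days as m * 2**exponent with 0.5 <= m < 1;
    # a double carries 53 significant bits, so one unit in the last
    # place is 2**(exponent - 53) days.
    _, exponent = math.frexp(ordinal_days)
    ulp_days = math.ldexp(1.0, exponent - 53)
    return ulp_days * SECONDS_PER_DAY * 1e6

recent = spacing_in_microseconds(730000.5)    # a day number near the year 2000
medieval = spacing_in_microseconds(369000.5)  # a day number near the year 1011
```

Under these assumptions the gap is roughly 10 microseconds for modern
dates and roughly 5 microseconds for dates around the year 1011.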

Below, I'm including some prototype code which does these conversions
- if you have interest or experience with dates and timezones, please
look over it to see if I'm making any fundamental or conceptual
errors.

There is also a function drange, which can be used to construct the
floating point days arrays plot_date would require.

Any other suggestions for improvement or changes to date handling
welcome. Speak now, or forever hold your peace!

JDH

import sys, datetime
from matplotlib.numerix import arange
from matplotlib.dates import Central, Pacific, Eastern, UTC

HOURS_PER_DAY = 24.
MINUTES_PER_DAY = 60.*HOURS_PER_DAY
SECONDS_PER_DAY = 60.*MINUTES_PER_DAY
MUSECONDS_PER_DAY = 1e6*SECONDS_PER_DAY

#tz = None
tz = Pacific
#tz = UTC

def close_to_dt(d1, d2, epsilon=5):
    'assert that datetimes d1 and d2 are within epsilon microseconds'
    delta = d2 - d1
    mus = abs(delta.days*MUSECONDS_PER_DAY + delta.seconds*1e6 +
              delta.microseconds)
    assert mus < epsilon

def close_to_ordinalf(o1, o2, epsilon=5):
    'assert that float ordinals o1 and o2 are within epsilon microseconds'
    delta = abs((o2 - o1)*MUSECONDS_PER_DAY)
    assert delta < epsilon

def to_ordinalf(dt):
    """
    convert datetime to the Gregorian date as UTC float days,
    preserving hours, minutes, seconds and microseconds. return value
    is a float
    """

    if dt.tzinfo is not None:
        delta = dt.tzinfo.utcoffset(dt)
        if delta is not None:
            dt -= delta
    
    base = dt.toordinal()
    return (base + dt.hour/HOURS_PER_DAY + dt.minute/MINUTES_PER_DAY +
            dt.second/SECONDS_PER_DAY + dt.microsecond/MUSECONDS_PER_DAY
            )

def from_ordinalf(x, tz=None):
    """
    convert a Gregorian float ordinal (UTC days) to a datetime,
    preserving hours, minutes, seconds and microseconds. the return
    value is a datetime in UTC, or in timezone tz if tz is given
    """
    ix = int(x)
    dt = datetime.datetime.fromordinal(ix)
    # peel off hours, minutes, seconds from the fractional day
    remainder = x - ix
    hour, remainder = divmod(24*remainder, 1)
    minute, remainder = divmod(60*remainder, 1)
    second, remainder = divmod(60*remainder, 1)
    microsecond = int(1e6*remainder)
    dt = datetime.datetime(dt.year, dt.month, dt.day,
                           int(hour), int(minute), int(second),
                           microsecond, tzinfo=UTC())

    if tz is not None:
        return dt.astimezone(tz)
    return dt

def drange(dstart, dend, delta):
    """
    Return a date range as float gregorian ordinals. dstart and dend
    are datetime instances. delta is a datetime.timedelta instance
    """
    step = delta.days + delta.seconds/SECONDS_PER_DAY + delta.microseconds/MUSECONDS_PER_DAY
    f1 = to_ordinalf(dstart)
    f2 = to_ordinalf(dend)
    return arange(f1, f2, step)
    
dt = datetime.datetime(1011, 10, 9, 13, 44, 22, 101010, tzinfo=tz)

x = to_ordinalf(dt)
newdt = from_ordinalf(x, tz)
close_to_dt(dt, newdt)

date1 = datetime.datetime( 2000, 3, 2, tzinfo=tz)
date2 = datetime.datetime( 2000, 3, 5, tzinfo=tz)
delta = datetime.timedelta(hours=8)
print(drange(date1, date2, delta))

d1 = datetime.datetime( 2000, 3, 2, 4, tzinfo=tz)
d2 = datetime.datetime( 2000, 3, 2, 12, tzinfo=UTC())
o1 = to_ordinalf(d1)
o2 = to_ordinalf(d2)
close_to_ordinalf(o1, o2)

print('all tests passed')

I would like to rewrite the dates module, and the ticker functions for
dates, to use the python datetime module. Getting dates, timezones,
and daylight savings time right is non-trivial, and I think the
cleanest approach is to require python2.3 and datetime. I.e., I would
jettison support for epoch times and mx datetimes, as well as the
converter stuff.

I agree that datetime is the cleanest approach.
It is always better to choose standard libraries over third-party ones.

The new plot_date signature would be

    def plot_date(self, d, y, fmt='bo', **kwargs):

where d would be an array of floats (no converter) and the floats
would be the number of days since 1,1,1 (Gregorian calendar). The
supported date range would be datetime.min to datetime.max (years 0001
- 9999).

Actually, MATLAB adopts the same approach, using floats for date and time.
There, the epoch is 0000/00/00. The datenum and datestr functions provide
the conversion between floats and strings.
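
For comparison, here is a small sketch of how the two day-number
conventions would line up. It rests on the assumption that a MATLAB
datenum is 366 larger than Python's toordinal() for the same date,
which is the commonly quoted offset. The helper names are hypothetical,
the code is written for Python 3, and only naive datetimes are handled.

```python
from datetime import datetime, timedelta

# Assumed offset: MATLAB datenum minus Python proleptic Gregorian ordinal.
MATLAB_OFFSET_DAYS = 366

def datetime_to_datenum(dt):
    """Naive datetime -> MATLAB-style float day number."""
    midnight = datetime(dt.year, dt.month, dt.day)
    day_fraction = (dt - midnight).total_seconds() / 86400.0
    return dt.toordinal() + MATLAB_OFFSET_DAYS + day_fraction

def datenum_to_datetime(num):
    """MATLAB-style float day number -> naive datetime."""
    whole = int(num)
    return (datetime.fromordinal(whole - MATLAB_OFFSET_DAYS)
            + timedelta(days=num - whole))
```

With that offset, datetime_to_datenum(datetime(2000, 1, 1)) gives
730486.0.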

The dates module would provide some helper functions that you could
use to build date arrays from datetime and timedelta instances. It
would not be too hard to add some helper functions to convert existing
epoch, mx, or datetime arrays to the required array of float days.

Timezones, including timezones other than the local one, would be
supported. I.e., if you are a financial guru in California, you could
work with Eastern time zone stock quotes or Central time zone pork
belly quotes. Daylight saving time, etc., would be handled by the
datetime module.

I have developed similar functions to convert an array of dates or
datetimes to an array of numbers. To avoid floating point imprecision,
I used seconds since 1900/1/1 without considering the time zone. In my
own domain, hydrologic modeling, I am rarely concerned about time zones.
Based on my experience, I noticed the following things.

The function names to_ordinalf and from_ordinalf are difficult to remember.
How about just time2num and num2time?
In addition, how about modifying the functions to handle an array of numbers
or datetimes directly, using the map function?
Also, the from_ordinalf function will produce an error message if a date
object, rather than a datetime object, is passed.
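
Something along these lines is what I have in mind. This is only a
sketch under several assumptions: it is written for Python 3, it
handles naive (timezone-free) values only, it returns plain lists
rather than arrays, and the helper names starting with an underscore
are made up for the example.

```python
from datetime import date, datetime, timedelta

def _one_time2num(d):
    # Promote a plain date to midnight so both types are accepted.
    if not isinstance(d, datetime):
        d = datetime(d.year, d.month, d.day)
    midnight = datetime(d.year, d.month, d.day)
    return d.toordinal() + (d - midnight).total_seconds() / 86400.0

def time2num(d):
    """Naive date/datetime, or a sequence of them, -> float days."""
    if isinstance(d, date):  # datetime is a subclass of date
        return _one_time2num(d)
    return list(map(_one_time2num, d))

def _one_num2time(x):
    whole = int(x)
    return datetime.fromordinal(whole) + timedelta(days=x - whole)

def num2time(x):
    """Float days, or a sequence of them, -> naive datetime(s)."""
    if isinstance(x, (int, float)):
        return _one_num2time(x)
    return list(map(_one_num2time, x))
```

For example, time2num([datetime(2000, 3, 2, 12), date(2000, 3, 3)])
returns two floats that are half a day apart.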

I also recommend providing strftime and isoformat functions that handle
arrays of floats directly.
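
A minimal sketch of such formatting helpers, again assuming Python 3,
naive UTC-style float days, and hypothetical function names
(strftime_days, isoformat_days):

```python
from datetime import datetime, timedelta

def _num2time(x):
    # Split the float into whole days and a fractional day.
    whole = int(x)
    return datetime.fromordinal(whole) + timedelta(days=x - whole)

def strftime_days(xs, fmt="%Y-%m-%d %H:%M"):
    """Format each float day number in xs with the given strftime pattern."""
    return [_num2time(x).strftime(fmt) for x in xs]

def isoformat_days(xs):
    """Return the ISO 8601 string for each float day number in xs."""
    return [_num2time(x).isoformat() for x in xs]
```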
Thanks for your effort.

Daehyok Shin

I think the
cleanest approach is to require python2.3 and datetime. I.e., I would
jettison support for epoch times and mx datetimes, as well as the
converter stuff.

+1 on this.

   def plot_date(self, d, y, fmt='bo', **kwargs):

where d would be an array of floats (no converter) and the floats
would be the number of days since 1,1,1 (Gregorian calendar). The
supported date range would be datetime.min to datetime.max (years 0001
- 9999).

Actually, MATLAB adopts the same approach using floats for date and time.

I think you should generally not blindly apply the MATLAB approach. Python is a more powerful language than MATLAB, and support for more than just doubles is one of its features!

How does the datetime module store the values in a datetime object? My first inclination would be to follow its approach, but it may not be suited to arrays :(

A couple projects you might want to make use of:

http://pytz.sourceforge.net/

and:

https://moin.conectiva.com.br/DateUtil

-Chris


--
Christopher Barker, Ph.D.
Oceanographer
                                         
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...259...