empty date formatter unit tests

_John_Hunter · September 20, 2009, 10:26pm

I spent a long time working on

https://sourceforge.net/tracker/index.php?func=detail&aid=2861426&group_id=80706&atid=560720

and the associated unit test. I learned a lot and the unit tests really helped.

First, I decided that if someone sets the formatter or locator to be a
DateFormatter or DateLocator, then they are expressing their intention
to plot dates, so I triggered the axis unit conversion pipeline on
set_major_formatter(aDateFormatter) to use a DateConverter. This
worked fine, but broke the JPL unit tests, because in their
EpochConverter, they have a *different* class that nonetheless
converts to the "datenum" floats, ie days since 0000-00-00, and thus
they also use the AutoDateFormatter and AutoDateLocator in their
AxisInfo converter info.

So what was happening is that the unit tests for the Epochs passed in
an Epoch instance, this triggered the unit conversion interface which
set the formatter to be an AutoDateFormatter, which triggered my new
code to use a DateConverter, which in turn broke the Epochs since the
DateConverter doesn't know how to handle an Epoch. This is quite
subtle, but what is happening is that two different classes are
converting to the same floating point representation, and so both can
use the same Formatter and Locator, but we can't invert -- just
because we see a Formatter or Locator of a certain type, we can't
infer the class.
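
To make the "can't invert" point concrete, here is a minimal standard-library sketch. Everything in it is a hypothetical stand-in (the `Epoch` class, the two converter functions and the lookup dicts are not the JPL or matplotlib code); it only illustrates that type -> converter is a well-defined lookup while formatter -> converter is not.

```python
import datetime


class Epoch:
    """Hypothetical stand-in for a JPL-style epoch: seconds after 2000-01-01."""

    def __init__(self, seconds):
        self.seconds = seconds


def date_to_days(d):
    """Convert a datetime.date to float days (proleptic Gregorian ordinal)."""
    return float(d.toordinal())


def epoch_to_days(e):
    """Convert an Epoch to the same float-days representation."""
    return datetime.date(2000, 1, 1).toordinal() + e.seconds / 86400.0


# Forward direction: each input type has exactly one converter.
converter_for_type = {datetime.date: date_to_days, Epoch: epoch_to_days}

# Both converters would advertise the same date-aware formatter,
# because both produce the same kind of float.
formatter_for_converter = {date_to_days: "AutoDateFormatter",
                           epoch_to_days: "AutoDateFormatter"}

a = converter_for_type[datetime.date](datetime.date(2000, 1, 2))
b = converter_for_type[Epoch](Epoch(86400))
print(a == b)  # the two types land on the same float

# Reverse direction: one formatter maps back to two candidate converters,
# so seeing the formatter does not tell us which input type to expect.
candidates = [conv.__name__ for conv, fmt in formatter_for_converter.items()
              if fmt == "AutoDateFormatter"]
print(candidates)
```

With two candidates for the same formatter, any rule that picks a converter from the formatter type alone has to guess.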

I would never have caught this w/o the unit test, and would have happily
committed my "fix" while screwing up all the JPL stuff.

At the end of the day, after realizing all this, I decided the current
behavior was not a bug at all, and that mpl was doing what it was told
to do. And there is no way to be smart here, since the use of a
DateFormatter does not imply you want a DateConverter. So I simply
modified the DateFormatter code to raise with an intelligible message.
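
A rough sketch of what "raise with an intelligible message" can look like. This is not the actual matplotlib change: `format_year` is a hypothetical helper, and it assumes the datenum convention in which 1.0 corresponds to 0001-01-01 (the same as Python's date ordinals).

```python
import datetime


def format_year(x):
    """Format a float datenum as a four-digit year, or fail with advice."""
    if x < 1:
        # A value below 1 cannot be a date in this convention; it usually
        # means the axis still has its plain 0..1 default limits.
        raise ValueError(
            "date formatter got x=%r, which is not a valid date. "
            "Did you forget to tell the axis it is plotting dates, "
            "e.g. with ax.xaxis_date()?" % (x,))
    return datetime.date.fromordinal(int(x)).strftime("%Y")


print(format_year(730120.0))  # a date in the year 2000
try:
    format_year(0.0)
except ValueError as err:
    print("error:", err)
```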

The question is: what to do with the unit test I wrote to expose this:
test_dates.test_empty_date_with_year_formatter

I can either leave it as a knownfailure, since that is what it is, or
modify it to set ax.xaxis_date as the traceback advises, and then add
the image comparison.

What do you think?

Another win for the unit tests, though they caused me to spend a lot
more time "fixing" this bug than I would have without them <wink>

JDH

_Drain_Theodore_R_34 · September 20, 2009, 11:50pm

John,
I've run into this problem quite a few times and I'd love to figure out some way to fix it. As an example, here's the kind of scenario this occurs in:

I embed MPL in a few different GUIs that plot data either in real-time or via the user selecting things. There is a saved state which contains preferences like auto-scaling, legend on/off, axis formatting, etc. When the app starts up, I need to create a plot to put on the screen and configure it. What I'd like to do is this:

- create widget
- apply format (date formatter, etc)
- apply settings (autoscale, etc)
- wait for data (either via real time feed or user clicking on things)

But this is impossible because of this kind of bug. Instead, I have to create a plot with a fake date range and test every operation to see if it's actually setting data before applying the settings like autoscale. In addition, if the user removes data from the plot (via menu or selectable lists), I have to either start over or "unset" the settings back to something safe so this error won't occur. It really makes coding something like this a royal pain.

I don't have a suggestion as of yet... Perhaps it could just return "N/A" or something like that.

I think part of the problem might be that the default ranges used by the autoscaling algorithm when there is no data are invalid for certain formatters and locators. That suggests that possible solutions might be one of:

1) require autoscaling or scaling algorithms to return ranges that will be OK for known scalers/formatters. Perhaps some system that allows different autoscaling algorithms to be set which can configure the default?
2) require scalers/formatters to be robust for any range or engineer the system to allow them to report "errors" in a way that allows the plot to do something reasonable and not trigger an exception (perhaps some changeable behavior w/ the default as an exception?).
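
As a sketch of option 2, a formatting callable could be wrapped so that an out-of-range value produces a placeholder label such as "N/A" instead of an exception. Both `strict_year` and `with_fallback` are hypothetical names used for illustration only, not existing matplotlib functions.

```python
import datetime


def strict_year(x, pos=None):
    """Hypothetical strict formatter: rejects floats that are not valid dates."""
    if x < 1:
        raise ValueError("x=%r is not a valid date ordinal" % (x,))
    return datetime.date.fromordinal(int(x)).strftime("%Y")


def with_fallback(format_func, fallback="N/A"):
    """Wrap a tick-format callable so a ValueError yields a placeholder label."""
    def wrapped(x, pos=None):
        try:
            return format_func(x, pos)
        except ValueError:
            return fallback
    return wrapped


lenient_year = with_fallback(strict_year)
print(lenient_year(730120.0))  # a valid date ordinal
print(lenient_year(0.5))       # not a valid date, so the fallback is used
```

A callable with this `(x, pos)` signature could be handed to `matplotlib.ticker.FuncFormatter`. The trade-off is that a misconfigured axis then shows placeholder labels quietly rather than failing loudly.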

I'll think about this a little this week and see if any other ideas come to mind.

Ted

_John_Hunter · September 21, 2009, 2:35am

I think we have this problem mostly licked. The problem I was writing
about in my email is a 2nd tier problem. For example, in svn HEAD,
you can specify an "empty" date plot as long as you inform mpl of your
intentions. From the test_date_empty unit test::

    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    ax.xaxis_date()
    fig.savefig('date_empty')

Here we are fine, because we call ax.xaxis_date, which informs mpl
that you intend to pass in datetime instances. The key piece which
makes this possible, which you allude to in your post, is the default
xlimits, which is new in svn HEAD. In particular, the default
converter provides an AxisInfo which now supports an optional
attribute

    default_limits: the default min, max of the axis if no data is present

which is overridden in the DateConverter:

    def axisinfo(unit, axis):
        'return the unit AxisInfo'
        # make sure that the axis does not start at 0

        majloc = AutoDateLocator(tz=unit)
        majfmt = AutoDateFormatter(majloc, tz=unit)
        datemin = datetime.date(2000, 1, 1)
        datemax = datetime.date(2010, 1, 1)

        return units.AxisInfo( majloc=majloc, majfmt=majfmt, label='',
                               default_limits=(datemin, datemax))

While the min/max are arbitrary, the important thing is that custom
types can now handle the default min/max limits, so when you present a
new type to mpl, the type can request a certain default view/data lim
if no data are presented. Additionally, because of the
"ignore" setting on the limits argument, we can detect whether the
limits we are applying are defaults or actively set by the user.
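
For a custom type, the same idea might look roughly like the sketch below. The `Day` class and `DayConverter` are hypothetical, made up for this illustration; the matplotlib pieces used are `units.ConversionInterface`, `units.AxisInfo(..., default_limits=...)`, `units.registry` and `Axis.update_units`. Whether the default limits take effect exactly as shown may depend on the matplotlib version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.units as units


class Day:
    """Hypothetical custom type: a plain day count."""

    def __init__(self, n):
        self.n = float(n)


class DayConverter(units.ConversionInterface):
    """Hypothetical converter for Day, registered below."""

    @staticmethod
    def convert(value, unit, axis):
        # Accept a single Day or a sequence of Day objects.
        if isinstance(value, Day):
            return value.n
        return [v.n for v in value]

    @staticmethod
    def axisinfo(unit, axis):
        # default_limits are given in the custom type; matplotlib runs
        # them through convert() when it applies them.
        return units.AxisInfo(label="day",
                              default_limits=(Day(100), Day(200)))


units.registry[Day] = DayConverter()

fig, ax = plt.subplots()
ax.xaxis.update_units(Day(0))  # no data yet, only the intended type
xmin, xmax = ax.get_xlim()
print(xmin, xmax)
```

The empty axes pick up the converter's requested range instead of the generic 0..1 default, which is the behavior the date converter relies on.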

The complication that motivated the sf bug
http://sourceforge.net/tracker/?func=detail&aid=2861426&group_id=80706&atid=560720
is a bit more subtle. Here no data type is presented to mpl -- either
through "plot" or "fill" or "set_xlim" or whatever. If the user had
passed any data in, or manually expressed their intent through
"ax.xaxis_date" we would be fine. The difficulty is that they passed
no data in but declared their intention to use a "YearFormatter". My
original inclination, and the one that failed the unit tests, was to
trigger a call to Axis.axis_date (a new method) on a call to
ax.xaxis.set_major_formatter (or locator) where the argument was a
DateLocator or DateFormatter. This seemed to be an eminently
reasonable and helpful thing to do -- if they want a date locator or
formatter presumably they will be passing in dates -- but the unit
tests told me this was wrong.

The locators and formatters work on *converted* units. The
EpochConverter and DateConverter both convert their native types to
floating point days since 0000-00-00. So here are two custom
converter interfaces which both end up with the same floating point
representation. The conclusion is: mpl cannot use the
locator/formatter type to infer the basic type that users will be
passing in. Just because two classes end up with the same floating
point representation does not indicate that they want the same
conversion pipeline from type -> float.

Nonetheless, we can, and already do in svn HEAD, handle the cases that
I think you are worried about in this email. As long as you know what
type you will be passing into mpl (regardless of whether you have any
data available right now) you can inform the units interface with

    ax.xaxis.update_units(someval)

where someval is an instance of the type you plan to pass in. Doing
so will not affect your current data or view limits, but will trigger
the conversion interface and importantly will trigger the
units.AxisInfo.default_limits scaling which was recently added to
avoid the kinds of problems we have been seeing with date conversion
when no data is passed in.
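
A minimal sketch of that call sequence for plain dates, assuming a reasonably recent matplotlib and a headless backend. The particular locator, formatter and date are arbitrary choices for the example, and the default limits you end up with depend on the matplotlib version.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# 1. Configure the axis before any data exists.
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

# 2. Declare the type we intend to pass in later; any instance will do.
ax.xaxis.update_units(datetime.date(2009, 9, 20))

# 3. Drawing the still-empty axes should now succeed.
fig.canvas.draw()
print(ax.xaxis.have_units(), ax.get_xlim())
```

No data or view limits are set explicitly here; the only thing the axis learns is which converter to use.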

So despite this long-winded email, the current infrastructure should support

* create axes, etc

* set your current formatter/locator

* ax.xaxis.update_units(myInstance)

where myInstance is an object of the type you expect to pass in. As
long as you have registered a converter from type(myInstance) ->
ConversionInterface, you can now specify the default limits through
the ConversionInterface.default_limits method::

    @staticmethod
    def default_units(x, axis):
        'return the default unit for x or None for the given axis'
        return None

As an example in matplotlib.dates, we choose an arbitrary interval,
which while arbitrary avoids the 0..1 problem we have been having::

class DateConverter(units.ConversionInterface):
    """The units are equivalent to the timezone."""

    @staticmethod
    def axisinfo(unit, axis):
        'return the unit AxisInfo'
        # make sure that the axis does not start at 0

        majloc = AutoDateLocator(tz=unit)
        majfmt = AutoDateFormatter(majloc, tz=unit)
        datemin = datetime.date(2000, 1, 1)
        datemax = datetime.date(2010, 1, 1)

        return units.AxisInfo( majloc=majloc, majfmt=majfmt, label='',
                               default_limits=(datemin, datemax))

JDH

_John_Hunter · September 21, 2009, 1:29pm

> where myInstance is an object of the type you expect to pass in. As
> long as you have registered a converter from type(myInstance) ->
> ConversionInterface, you can now specify the default limits through
> the ConversionInterface.default_limits method::
>
>     @staticmethod
>     def default_units(x, axis):
>         'return the default unit for x or None for the given axis'
>         return None

This is a typo: I should be referring to the default_limits attribute
of AxisInfo. Instead, I pasted in the default_units method of the
conversion interface.

> As an example in matplotlib.dates, we choose an arbitrary interval,
> which while arbitrary avoids the 0..1 problem we have been having::
>
> class DateConverter(units.ConversionInterface):
>     """The units are equivalent to the timezone."""
>
>     @staticmethod
>     def axisinfo(unit, axis):
>         'return the unit AxisInfo'
>         # make sure that the axis does not start at 0
>
>         majloc = AutoDateLocator(tz=unit)
>         majfmt = AutoDateFormatter(majloc, tz=unit)
>         datemin = datetime.date(2000, 1, 1)
>         datemax = datetime.date(2010, 1, 1)
>
>         return units.AxisInfo( majloc=majloc, majfmt=majfmt, label='',
>                                default_limits=(datemin, datemax))

This is the correct way to specify default_limits.

JDH

_Drain_Theodore_R_34 · September 21, 2009, 2:34pm

Wow - thanks for the detailed update. I feel bad for making you type that much.

Thanks for fixing that problem.

Ted
