Units issue

_Ryan_May · May 20, 2009, 4:38pm

Hi,

In looking over a test failure, I’m seeing some behavior that doesn’t make sense to me. It looks like data passed to a line object is being improperly converted when units are involved. Here’s a version of the code in the test script, but modified to use the units in basic_units.py (in the examples/units directory). You should be able to just drop this script into the examples/units directory and run it:

from basic_units import secs, minutes, cm
import matplotlib.pyplot as plt

xdata = [ x*secs for x in range(10) ]
ydata1 = [ (1.5*y - 0.5)*cm for y in range(10) ]
ydata2 = [ (1.75*y - 1.0)*cm for y in range(10) ]

fig = plt.figure()
ax = plt.subplot( 111 )
l1, = ax.plot( xdata, ydata1, color='blue', xunits=secs )
l2, = ax.plot( xdata, ydata2, color='green', xunits=minutes )

print(l1._xorig)
print(l2._xorig)

print(ax.lines)

plt.show()

Based on the original test, it seems like this behavior should work (just rescale the x-axis without actually changing the plot). Am I missing something, or is this a real bug?

Ryan

···

–
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from Norman, Oklahoma, United States

_Chris.Barker · May 20, 2009, 4:54pm

Ryan May wrote:

use the units in basic_units.py (in the examples/units directory).

This looks like pretty cool stuff. However, I can't seem to find matplotlib.units or basic_units.py in the online Sphinx docs. Is this a doc bug, or intentional?

There are units examples in the docs.

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

_Ryan_May · May 20, 2009, 4:55pm

It looks like revision 7020 broke this in the process of adding units support for fill().

If I change the following lines (in the _xy_from_xy() function):

        if bx:
            x = self.axes.convert_xunits(x)

        if by:
            y = self.axes.convert_yunits(y)

back to:

if bx or by: return x, y, False

the example I posted works and the test failure I was seeing is gone. Of course, this breaks fill() with unit-ed quantities. I’m getting a little over my head here in terms of tracing the flow of units, so I’d love to hear opinions on how to actually fix this. IMHO, we really need to standardize on how units are handled. In some cases the axes method handles converting units, but in this case, the Line2D object also registers for changes to axis units so it can update itself.

Ryan


_Ryan_May · May 20, 2009, 5:01pm

matplotlib.units maintains the API for registering unit-ed quantities and various other nuts and bolts. It’s another one of those modules whose docs haven’t been converted to sphinx yet, but it does have doc strings. However, it does not provide any units itself. basic_units.py is an example with just a few basic quantities to show off how support in matplotlib works, but is not itself all that useful.

Darren Dale was working on a full-fledged package for adding units to numpy arrays called quantities (http://packages.python.org/quantities/user/tutorial.html), that would (I think) work with some of this, but last I saw it stalled a little due to issues with subclassing ndarray. I haven’t seen any other simple packages/modules that support general units for the simple goal of doing conversions for plotting.

Ryan


_John_Hunter · May 20, 2009, 5:46pm

The fundamental problem here is that some artists (Line2D) have
support for storing original unitized data (_xorig, _yorig) and
handling the conversion on unit change internally with the callback,
and some artists (eg Patches) do not. axes._process_plot_var_arg
serves both plot (Line2D) and fill (Polygon), one of which is
expecting possibly unitized data and one which is only capable of
handling already converted data. Hence the "fix one problem, create
another" bind we are now in.

So yes, we need a standard.

I think the resolution might be in having intermediate higher level
container plot item objects (an ErrorBar, LinePlot, FilledRegion)
which store the original data, manage the units, and pass converted
data back to the primitives. This is obviously a major refactoring,
and would require some thought, but may be the best way to go.
Handling the conversions in the plotting functions (eg fill, errorbar)
is probably not the right way because there is no obvious way to
support unit changes (eg inches to cm) since the data is already
converted, the artists already created.

Having the artist primitives store the original, possibly unitized
data, and register callbacks for unit changes can work, but the
problem is how to create the artist primitives in such a way the unit
data is passed through correctly. The problem here is that some
operations don't make sense for certain unit types -- think addition
with datetimes. Some functions, eg bar or errorbar, which need to do
a lot of arithmetic on the input arrays, may want to do:

xmid = 0.5*(x[1:] + x[:-1])

which would not work for x if x is datetime (can't add two dates).
distance and scaling should always be well defined, so one should be
able to do:

xmid = x[:-1] + 0.5*(x[1:]-x[:-1])

So one solution is to require all plotting functions to respect the
"no addition" rule, ie define the set of operations that are allowed
for plotting functions, and all artists to handle original unitized
data with internal conversion. This is a fair amount of work at the
plotting function layer, is invasive to the artist primitives, and
requires the extra storage at the artist layer, but could work.
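As a rough illustration of the "no addition" rule, here is a minimal sketch using only the standard library's datetime. The helper name `midpoints` is made up for this example and is not part of matplotlib; it assumes the input type supports subtraction, scaling of the difference by a float, and adding the scaled difference back.

```python
from datetime import datetime

def midpoints(x):
    """Midpoints of consecutive values using only subtraction and scaling.

    Works for any type where (b - a) is defined, can be scaled by a float,
    and can be added back to a (for datetime, b - a is a timedelta).
    """
    return [a + 0.5 * (b - a) for a, b in zip(x[:-1], x[1:])]

dates = [datetime(2009, 5, 20), datetime(2009, 5, 22), datetime(2009, 5, 26)]
mids = midpoints(dates)

# Adding two datetimes directly is not defined and raises TypeError.
try:
    dates[0] + dates[1]
    addition_works = True
except TypeError:
    addition_works = False
```

The same helper gives the usual result for plain floats, so a plotting function written this way would not need a separate code path for dates.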

The other solution, what I referred to as the intermediate plot item
container, is to have a class ErrorBar, eg, which is like the errorbar
method, but has an API like

  class ErrorBar:
    def __init__(self, all the errorbar args, possibly unitized):
      self.store_all_original_data_here()
      self.store_all_primitives_from_converted_data_here()

      def callback():
          self.update_all_stored_primitives_newly_converted_original_data()
      self.connect_callback_to_unit_change(callback)

This has the advantage that the plot item container class can always
work with arrays of floats (removing the onerous restriction on what
kind of binary relations are allowed) and removes the restrictions on
creating artists which are unit aware.
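To make the container idea a little more concrete, here is a minimal pure-Python sketch. Everything in it (`UnitAxis`, `ErrorBarItem`, the conversion table) is hypothetical and only stands in for whatever the real axis and artist machinery would be; the point is that the original data is kept, and the float "primitives" are rebuilt whenever the unit changes.

```python
class UnitAxis:
    """Hypothetical stand-in for an axis that knows its current unit."""

    # Conversion factors to centimetres; illustrative only.
    TO_CM = {"cm": 1.0, "inch": 2.54}

    def __init__(self, unit):
        self.unit = unit
        self._callbacks = []

    def connect(self, callback):
        self._callbacks.append(callback)

    def set_unit(self, unit):
        self.unit = unit
        for callback in self._callbacks:
            callback()

    def convert(self, values, unit):
        """Convert values given in `unit` to plain floats in the axis unit."""
        factor = self.TO_CM[unit] / self.TO_CM[self.unit]
        return [v * factor for v in values]


class ErrorBarItem:
    """Keeps the original data and rebuilds float 'primitives' on unit change."""

    def __init__(self, axis, y, yerr, unit):
        self.axis = axis
        # The original, possibly unitized, data is stored untouched.
        self._orig = (list(y), list(yerr), unit)
        self.primitives = {}
        self._rebuild()
        axis.connect(self._rebuild)

    def _rebuild(self):
        y, yerr, unit = self._orig
        yf = self.axis.convert(y, unit)
        ef = self.axis.convert(yerr, unit)
        # Plain float arithmetic, including addition, is safe here.
        self.primitives = {
            "line": yf,
            "caps_low": [a - b for a, b in zip(yf, ef)],
            "caps_high": [a + b for a, b in zip(yf, ef)],
        }


axis = UnitAxis("inch")
eb = ErrorBarItem(axis, y=[1.0, 2.0], yerr=[0.5, 0.5], unit="inch")
before = eb.primitives["caps_high"]
axis.set_unit("cm")          # triggers the callback, primitives are rebuilt
after = eb.primitives["caps_high"]
```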

It also makes for a nicer API:

  eb = ErrorBar(something)
  eb.draw()

  # hmm, the cap widths are too small
  eb.capwidth = 12
  eb.draw()

ie, instead of getting back a bunch of artist primitives from errorbar
which may be difficult to manipulate, you get back an ErrorBar object
that knows how to update and plot itself.

With traits or properties so that the eb.capwidth attr setting
triggers a unitized updating of primitives, then everything is fairly
transparent to the user.
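A small sketch of the property variant, assuming nothing beyond plain Python. `CapItem` and its attributes are invented for illustration; a real plot item would resize its cap artists where this one just recomputes a number.

```python
class CapItem:
    """Hypothetical plot item whose `capwidth` property triggers a refresh."""

    def __init__(self, capwidth=6):
        self._capwidth = capwidth
        self.updates = 0          # counts how often primitives were rebuilt
        self._update_primitives()

    def _update_primitives(self):
        # A real implementation would update the cap artists here.
        self.cap_half_width = self._capwidth / 2.0
        self.updates += 1

    @property
    def capwidth(self):
        return self._capwidth

    @capwidth.setter
    def capwidth(self, value):
        self._capwidth = value
        self._update_primitives()


item = CapItem()
item.capwidth = 12   # assignment alone is enough to rebuild the primitives
```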

It would also make it easier to support containers of artists for logical
groupings during animation, zorder buffering/blitting, etc.

JDH


_Chris.Barker · May 20, 2009, 6:10pm

Ryan May wrote:

It's another one of those modules whose docs hasn't been converted to sphinx yet, but it does have doc strings.

Couldn't/shouldn't sphinx just use the docstrings so that there is SOMETHING there? I really love the sphinx docs, but it is frustrating to have a module simply not listed at all.

Darren Dale was working on a full-fledged package for adding units to numpy arrays called quantities (http://packages.python.org/quantities/user/tutorial.html),

thanks for the reminder -- that does look like a really nice package. It would be great to have a semi-standard for this stuff in the SciPy world -- and certainly MPL compatible!

last I saw it stalled a little due to issues with subclassing ndarray.

Darn. I hope I'll get a chance to delve into it soon.

-Chris


_Ryan_May · May 20, 2009, 6:11pm

That’s not to say that it’s not currently functional, I just believe that some ufuncs don’t work properly and that there are some corner cases that don’t work, which I think is why Darren hasn’t made an official release/announcement. Last time I played with it however, it was quite useful.

Ryan


Eric_Firing2 · May 20, 2009, 7:36pm

John Hunter wrote:

The fundamental problem here is that some artists (Line2D) have
support for storing original unitized data (_xorig, _yorig) and
handling the conversion on unit change internally with the callback,
and some artists (eg Patches) do not. axes._process_plot_var_arg
serves both plot (Line2D) and fill (Polygon), one of which is
expecting possibly unitized data and one which is only capable of
handling already converted data. Hence the "fix one problem, create
another" bind we are now in.

So yes, we need a standard.

John,

As you know, I agree. This has been a frustrating problem for a long time.

I think the resolution might be in having intermediate higher level
container plot item objects (an ErrorBar, LinePlot, FilledRegion)
which store the original data, manage the units, and pass converted
data back to the primitives. This is obviously a major refactoring,
and would require some thought, but may be the best way to go.
Handling the conversions in the plotting functions (eg fill, errorbar)
is probably not the right way because there is no obvious way to
support unit changes (eg inches to cm) since the data is already
converted, the artists already created.

I'm not sure I understand the use case for unit *changes*, as opposed to initial unit specification.

Having the artist primitives store the original, possibly unitized
data, and register callbacks for unit changes can work, but the
problem is how to create the artist primitives in such a way the unit
data is passed through correctly. The problem here is that some
operations don't make sense for certain unit types -- think addition
with datetimes. Some functions, eg bar or errorbar, which need to do
a lot of arithmetic on the input arrays, may want to do:

xmid = 0.5*(x[1:] + x[:-1])

which would not work for x if x is datetime (can't add two dates).
distance and scaling should always be well defined, so one should be
able to do:

xmid = x[:-1] + 0.5*(x[1:]-x[:-1])

So one solution is to require all plotting functions to respect the
"no addition" rule, ie define the set of operations that are allowed
for plotting functions, and all artists to handle original unitized
data with internal conversion. This is a fair amount of work at the
plotting function layer, is invasive to the artist primitives, and
requires the extra storage at the artist layer, but could work.

Sounds horrible to me. I would really like to see clear stratification, with all complicated and flexible argument handling restricted to some not-too-low level.

The other solution, what I referred to as the intermediate plot item
container, is to have a class ErrorBar, eg, which is like the errorbar
method, but has an API like

  class ErrorBar:
    def __init__(self, all the errorbar args, possibly unitized):
      self.store_all_original_data_here()
      self.store_all_primitives_from_converted_data_here()

      def callback():
          self.update_all_stored_primitives_newly_converted_original_data()
      self.connect_callback_to_unit_change(callback)

This has the advantage that the plot item container class can always
work with arrays of floats (removing the onerous restriction on what
kind of binary relations are allowed) and removes the restrictions on
creating artists which are unit aware.

I think something like this is the way to go. Even without the problem with units, I would like to see things like the bar family, errorbar, and boxplot moved out into their own classes; and there is no reason not to do the same for simple line plots (which are anything but simple in their input argument handling). Then the Axes class can concentrate on Axes creation and manipulation.

I think there are also opportunities for factoring out common operations involving input parameter handling--not just units conversion, but validation, checking dimensions, generating X and Y with meshgrid when needed, etc. Some of these things are already partly factored out, but helpers are scattered around, and I suspect there is some unproductive duplication.

Of course, the big question is how to get all this done... Fortunately, unless I am missing a key point, this sort of refactoring can be done incrementally; it is not as drastic as the transforms refactoring was.

Eric


_John_Hunter · May 20, 2009, 7:49pm

The use case (and we can debate whether this is worth the extra overhead)

ax.plot(inches)
ax.set_xlim(cms)

And the plot will automagically update with the new units. This
worked in the original implementation, but due to some code rot has
breakage somewhere. This was a feature requested by the JPL when I
did the original implementation.

Alternatively if you did

ax.plot(inches)

and later

ax.plot(cms)

the first line would be updated to cm and both would be plotted in cm.
But if you did

ax.plot(seconds)

you would get an error since the inches and cms lines would not be
able to convert.

In the scheme I proposed (plot items with updates on unit change), if
you had a line object contained in a PlotItem class, and the original
units were in inches, the line's xdata would be simple float array in
inches. If you changed the axis units to cm, the line's xdata would
automatically be updated to floats but now in cm.
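The behaviour described above can be sketched in a few lines of plain Python. The unit table, `rescale`, and `UnitAxes` are all hypothetical names for this illustration and say nothing about how matplotlib implements it; the sketch only shows "latest compatible unit wins, incompatible unit raises".

```python
# Illustrative unit table: name -> (dimension, factor to the base unit).
UNITS = {
    "inch": ("length", 2.54),
    "cm": ("length", 1.0),
    "second": ("time", 1.0),
}

def rescale(values, from_unit, to_unit):
    """Return plain floats in `to_unit`; raise if the dimensions differ."""
    from_dim, from_factor = UNITS[from_unit]
    to_dim, to_factor = UNITS[to_unit]
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_unit} to {to_unit}")
    return [v * from_factor / to_factor for v in values]


class UnitAxes:
    """Keeps original data per line and converts on demand."""

    def __init__(self):
        self.unit = None
        self.originals = []   # list of (original values, original unit)

    def plot(self, values, unit):
        if self.unit is not None:
            # Fails early, before any state changes, if units are incompatible.
            rescale(values, unit, self.unit)
        self.originals.append((list(values), unit))
        self.unit = unit      # the most recent unit becomes the axis unit

    def xdata(self):
        """Float data for every line, expressed in the current axis unit."""
        return [rescale(v, u, self.unit) for v, u in self.originals]


ax = UnitAxes()
ax.plot([1.0, 2.0], "inch")
ax.plot([10.0], "cm")         # axis unit becomes cm; the first line follows
converted = ax.xdata()

try:
    ax.plot([3.0], "second")
    error = None
except ValueError as exc:
    error = str(exc)
```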

JDH


_Chris.Barker · May 20, 2009, 9:02pm

John Hunter wrote:

The use case (and we can debate whether this is worth the extra overhead)

ax.plot(inches)
ax.set_xlim(cms)

I'll put my two cents into that debate:

My first thought is: wow! that is putting WAY too much into a plotting routine!

My second thought is: on the other hand, that is very cool.

If it's going to be done, I think it really shouldn't be too MPL specific -- it should be built on a good (and hopefully eventually widely used) unit-array system, perhaps like Darren Dale's Quantities package (there are quite a few others that should be looked at also).

What that means is that the first step is to get that package complete and robust. Using it for this kind of MPL functionality may be a good way to put it to the test.

In between, with a good Quantities package, it's not that big a deal to put the unit conversion in the hands of user code. The user code would simply need to be something like:

ax.plot(values.rescale('cm'))
ax.set_xlim(limits.rescale('cm'))

a bit klunkier, but very clear. Explicit is better than implicit...

ax.plot(cms)

the first line would be updated to cm and both would be plotted in cm.

this is a little too implicit for me -- I'd rather specify the units explicitly, rather than have the last data added determine it for me.

ax.set_xunit('cm')

I'd probably have it default to the first unit used.

-Chris


_John_Hunter · May 20, 2009, 9:21pm

John Hunter wrote:

The use case (and we can debate whether this is worth the extra overhead)

ax.plot(inches)
ax.set_xlim(cms)

I'll put my two cents into that debate:

My first thought is: wow! that is putting WAY too much into a plotting
routine!

My second thought is: on the other hand, that is very cool.

If it's going to be done, I think it really shouldn't be too MPL
specific -- it should be built on a good (and hopefully eventually
widely used) unit-array system, perhaps like Darren Dale's Quantities
package (there are quite a few other that should be looked at also).

This is not how it works -- we will not be assuming any units package.
Rather, we provide an interface where any units package can be used
with mpl. The original use case is that the JPL has an internal units
package and they want to pass their objects directly to mpl -- they
get handed these objects by custom internal C++ libs with python
wrappers over which they have no control maintained by another group.
So they cannot modify that package. What they can do is access the
matplotlib units registry and register an entry there that maps type
-> a converter class that exposes a certain interface we require. The
converter class not only knows how to convert the units to floats, but
also how to set tick locators, formatters and labels. When we get
passed in a type, eg a datetime, a Quantities instance, or whatever, we
ask the registry if there is a converter, and if so act appropriately
(though not always, hence the current thread).

One nice thing about this is we were able to extend support to native
datetime objects (which we cannot modify obviously) to mpl, so this
facility works with both proper unit types as well as arbitrary types.
This feature was not part of the original design spec, but fell
naturally out of it, which suggests to me that we are onto something.
So Darren's or anyone else's package can be made to work with mpl with
little work (the harder part is getting all of mpl to respect the unit
conversion interface everywhere, which is what we are discussing). To
give you a better idea what this looks like, the *entire* support in
mpl for handling native datetime objects looks like this::

class DateConverter(units.ConversionInterface):
    """The units are equivalent to the timezone."""

    @staticmethod
    def axisinfo(unit, axis):
        'return the unit AxisInfo'
        # make sure that the axis does not start at 0
        if axis:
            ax = axis.axes

            if axis is ax.get_xaxis():
                xmin, xmax = ax.dataLim.intervalx
                if xmin==0.:
                    # no data has been added - let's set the default datalim.
                    # We should probably use a better proxy for the datalim
                    # have been updated than the ignore setting
                    dmax = today = datetime.date.today()
                    dmin = today-datetime.timedelta(days=10)

                    ax._process_unit_info(xdata=(dmin, dmax))
                    dmin, dmax = ax.convert_xunits([dmin, dmax])

                    ax.viewLim.intervalx = dmin, dmax
                    ax.dataLim.intervalx = dmin, dmax
            elif axis is ax.get_yaxis():
                ymin, ymax = ax.dataLim.intervaly
                if ymin==0.:
                    # no data has been added - let's set the default datalim.
                    # We should probably use a better proxy for the datalim
                    # have been updated than the ignore setting
                    dmax = today = datetime.date.today()
                    dmin = today-datetime.timedelta(days=10)

                    ax._process_unit_info(ydata=(dmin, dmax))
                    dmin, dmax = ax.convert_yunits([dmin, dmax])

                    ax.viewLim.intervaly = dmin, dmax
                    ax.dataLim.intervaly = dmin, dmax

        majloc = AutoDateLocator(tz=unit)
        majfmt = AutoDateFormatter(majloc, tz=unit)
        return units.AxisInfo( majloc=majloc, majfmt=majfmt, label='' )

    @staticmethod
    def convert(value, unit, axis):
        if units.ConversionInterface.is_numlike(value): return value
        return date2num(value)

    @staticmethod
    def default_units(x, axis):
        'Return the default unit for *x* or None'
        return None

units.registry[datetime.date] = DateConverter()
units.registry[datetime.datetime] = DateConverter()

See the matplotlib.units module for more info.
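For readers who want the type -> converter lookup in isolation, here is a stripped-down sketch in plain Python. It only mimics the idea: the `registry` dict, `get_converter`, `to_floats`, and `OrdinalDateConverter` below are stand-ins written for this example, not matplotlib's own implementation, and the converter deliberately ignores time of day.

```python
import datetime

class OrdinalDateConverter:
    """Hypothetical converter: dates become day counts (floats)."""

    @staticmethod
    def convert(value, unit, axis):
        if isinstance(value, (int, float)):
            return value                    # already numeric, pass through
        return float(value.toordinal())

    @staticmethod
    def default_units(x, axis):
        return None


# Stand-in registry: maps a type to a converter instance.
registry = {datetime.date: OrdinalDateConverter()}

def get_converter(x):
    """Find a converter for x, or for the first element of a sequence."""
    for cls in type(x).__mro__:
        if cls in registry:
            return registry[cls]
    if isinstance(x, (list, tuple)) and x:
        return get_converter(x[0])
    return None

def to_floats(xs):
    """Convert a sequence to floats if a converter is registered for it."""
    converter = get_converter(xs)
    if converter is None:
        return list(xs)
    return [converter.convert(x, None, None) for x in xs]


day = datetime.date(2009, 5, 20)
as_floats = to_floats([day, day + datetime.timedelta(days=1)])
```

Because the lookup walks the class hierarchy, a `datetime.datetime` value finds the converter registered for `datetime.date` in this sketch; types with no entry are passed through unchanged.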

ax.plot(values.rescale('cm'))
ax.set_xlim(limits.rescale('cm'))

a bit klunkier, but very clear. Explicit is better than implicit...

I'm open to the idea of not supporting post-facto conversions after
data is added, but am mostly minus one on it, and I'd like to hear
from the JPL who requested the ability initially. I think their users
are working with complex plots and might have arrays in different
distance units, and would like to be able to pass in any distance
units as long as conversion is possible. I think having proper units
support kind of implies that you should be able to handle conversion
between compatible units seamlessly. Else we are basically in the
date2num world -- just make all the users convert to floats before
working with mpl, since there is little difference between the code
you suggest::

ax.plot(values.rescale('cm'))
ax.set_xlim(limits.rescale('cm'))

and::

ax.plot(values.rescale('cm').tofloat())
ax.set_xlim(limits.rescale('cm').tofloat())

where the latter means we have no units or custom type support.

JDH


_Darren_Dale2 · May 20, 2009, 11:27pm

Thanks for the mention, Ryan. The package hasn’t really stalled due to limitations with numpy (although there are some that I would like to address); it’s just that I have been too busy with other things to work on it. I am planning to continue again in June.

Darren


_Darren_Dale2 · May 20, 2009, 11:47pm

I have been waiting to make an announcement because I am in the middle of overhauling the unit tests; I want them to be more robust and cleaner than they are at present. I also wanted to see whether it would be possible to make an addition to numpy’s ufunc mechanism so existing ufuncs can perform a units operation on the way in (so an error can be raised in case of an illegal operation before data is changed in place, for example), rather than on the way out (currently done using ndarray.__array_wrap__). Aside from this corner case, I think all of the common arithmetic ufuncs already just work, and Quantities should already be usable. It needs a couple of easy tweaks to make some operations easier, and I need input from the community about how much magic is appropriate (right now inches + feet raises an error, since it’s not clear what units are desired for the result). It would probably not take much work to implement missing features and ufuncs, especially if a few others were interested in helping out.
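To illustrate why checking units "on the way in" matters for in-place operations, here is a toy sketch. The `Quantity` class below is invented for this example and is not the API of the quantities package; it only shows that validating before mutating leaves the data intact when the units do not match.

```python
class Quantity:
    """Toy unit-carrying sequence; not the quantities package API."""

    def __init__(self, magnitudes, unit):
        self.magnitudes = list(magnitudes)
        self.unit = unit

    def _check_same_unit(self, other):
        if not isinstance(other, Quantity) or other.unit != self.unit:
            raise ValueError("units differ; rescale one operand explicitly")

    def __add__(self, other):
        self._check_same_unit(other)
        summed = [a + b for a, b in zip(self.magnitudes, other.magnitudes)]
        return Quantity(summed, self.unit)

    def __iadd__(self, other):
        # Validate first ("on the way in"), so a failure cannot leave
        # self.magnitudes half-modified.
        self._check_same_unit(other)
        for i, b in enumerate(other.magnitudes):
            self.magnitudes[i] += b
        return self


inches = Quantity([1.0, 2.0], "inch")
feet = Quantity([1.0, 1.0], "foot")

try:
    inches += feet
    raised = False
except ValueError:
    raised = True
```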

Darren


_Chris.Barker · May 21, 2009, 7:20pm

John Hunter wrote:

If it's going to be done, I think it really shouldn't be too MPL
specific -- it should be built on a good (and hopefully eventually
widely used) unit-array system, perhaps like Darren Dale's Quantities
package (there are quite a few other that should be looked at also).

This is not how it works -- we will not be assuming any units package.
Rather, we provide an interface where any units package can be used
with mpl.

Fair enough, but you still need to require a particular API to a unit-ed object, which is not so different.

One thing that strikes me is that there is a distinctive difference between something like Darren's Quantities (and other numpy-based packages) and what MPL now supports for DateTimes -- in Quantities, the sequence itself has units, whereas with Datetimes, you use a generic sequence, and each element has units. I suppose that difference can be dealt with in the API, though.

The original use case is that the JPL has an internal units
package and they want to pass their objects directly to mpl

But, of course, the rest of us probably don't want to (or can't) use JPL's package, so we'll want a more generic package to test with and write samples for, etc.

In general, I think it's next to impossible to write a generic API without AT LEAST two use cases -- so maybe JPL's and Quantities would be a good start.

One nice thing about this is we were able to extend support to native
datetime objects (which we cannot modify obviously) to mpl, so this
facility works with both proper unit types as well as arbitrary types.

And I have enjoyed the DateTime support (except when it's not there, natch!). In thinking about this more, I think the real benefit is in the coupling of the units support with nifty things like AutoLocators and AutoFormatters -- these are great for DateTimes, and my first thought was "who cares" for simpler units like meters. However, in thinking, I realize that I've written a fair bit of code for my data that may be in meters, for instance, that goes like:

if max < 1:
    do_stuff_to_display_centimeters()
elif max < 1000:
    do_stuff_to_display_meters()
else:
    do_stuff_to_display_kilometers()

It would be nice to push that stuff into an MPL locator and formatter, even if I do need to write them myself. And, ideally between us all, a nice collection of generic ones could be written.

I could (and now that I think about it, will) still do that by simply assuring my data are always in a particular unit, but it would be nicer if the locators could be unit aware, so that one could pass in any length unit, and apply a "SI_length_Formatter" to it. Or just SI_Formatter, now that I think about it.
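As a sketch of what the core of such a formatter might look like, here is a plain function that assumes the data are always in meters. The name `si_length_label` and the thresholds are just the ones from the snippet above, not an existing matplotlib formatter.

```python
def si_length_label(value_m, axis_max_m):
    """Format a length in meters, choosing cm, m or km from the axis range."""
    if axis_max_m < 1:
        return f"{value_m * 100:g} cm"
    elif axis_max_m < 1000:
        return f"{value_m:g} m"
    else:
        return f"{value_m / 1000:g} km"


labels = [
    si_length_label(0.25, 0.5),
    si_length_label(250, 500),
    si_length_label(2500, 5000),
]
```

One possible way to hook this up would be to wrap it in matplotlib.ticker.FuncFormatter, with the axis maximum read from the current view limits, though that part is left out here.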

I'm not sure how to resolve one issue:

If I have a locator/formatter that decides whether to display cm or km, etc, depending on values, I probably want the axis label to reflect that too, but I don't know how one can get all those to communicate.

Also, it sounds like you're talking about converting units to the same something -- but, for length, it might be feet, or miles, or cm, or.... This is a bit different than what is done for time, where datetimes are always converted to the same base -- days since 0001-01-01 00:00:00. Perhaps this convention could be followed with a standard base unit for length, etc. though maybe that wouldn't capture the range of precisions that may be required -- some data in centuries, some in nanoseconds...
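To put a rough number on the precision concern, here is a small standard-library sketch of a float-days representation. `to_float_days` is a hypothetical helper built on `datetime.toordinal()`, which counts 0001-01-01 as day 1; it is not claimed to match date2num exactly.

```python
import math
from datetime import datetime

def to_float_days(dt):
    """Ordinal day number plus the fraction of the day, as a float."""
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second + dt.microsecond / 1e6
    return dt.toordinal() + seconds / 86400.0


first_day = to_float_days(datetime(1, 1, 1))
noon_next_day = to_float_days(datetime(1, 1, 2, 12))

# Spacing between adjacent floats near a 2009 date, converted to seconds.
# This is the finest time step a float-days value can represent there.
x = to_float_days(datetime(2009, 5, 21))
resolution_s = math.ulp(x) * 86400.0
```

With a single base unit of days, the resolution near present-day dates comes out on the order of microseconds, which is fine for most data but would not capture nanosecond-scale measurements.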

(by the way, there was some work on handling datetimes with numpy arrays a while back -- I wonder what came of that?)

I'm open to the idea of not supporting post-facto conversions after
data is added, but am mostly minus one on it, and I'd like to hear
from the JPL who requested the ability initially. I think their users
are working with complex plots and might have arrays in different
distance units, and would like to be able to pass in any distance
units as long as conversion is possible.

I can see that, but suggest that the unit finally displayed by the plot be specified by an axis method, or Locators or Formatters, or ??, but in any case, not change depending on what order you add data to the plot.

It would be pretty cool to be able to do:

ax.plot(x, data_in_feet)
ax.plot(x, data_in_meters)

and get it all scaled right!
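
A rough sketch of what "scaled right" could mean, assuming the display unit is fixed on the axis up front rather than inferred from whichever data set is added first. The conversion table and `to_axis_unit` are hypothetical and not a matplotlib API.

```python
# Hypothetical conversion table (meters per unit); not a matplotlib API.
METERS_PER_UNIT = {"m": 1.0, "cm": 0.01, "km": 1000.0, "ft": 0.3048}


def to_axis_unit(values, unit, axis_unit):
    """Convert plain numbers given in `unit` to the unit the axis displays."""
    factor = METERS_PER_UNIT[unit] / METERS_PER_UNIT[axis_unit]
    return [v * factor for v in values]


# The axis unit is chosen once, so the result does not depend on the
# order in which the data sets are added.
axis_unit = "m"
data_in_feet = [0.0, 10.0, 100.0]
data_in_meters = [0.0, 3.0, 30.0]

on_axis_from_feet = to_axis_unit(data_in_feet, "ft", axis_unit)
on_axis_from_meters = to_axis_unit(data_in_meters, "m", axis_unit)
```

Both lists now share one scale and could be handed to `ax.plot` as ordinary floats.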

there is little difference between the code
you suggest::

  ax.plot(values.rescale('cm'))
  ax.set_xlim(limits.rescale('cm'))

and::

  ax.plot(values.rescale('cm').tofloat())
  ax.set_xlim(limits.rescale('cm').tofloat())

where the latter means we have no units or custom type support.

There are a couple of differences:

1) with date2num, we still always use float-days-since-epoch for the actual numbers. That means that there can be one set of formatters. In that example, what units would tofloat() return? If we want formatters to work, some info about the units needs to be passed into mpl.

2) in the second version -- every unit-ed data type would have to have a tofloat() method (and what units would those floats be in?), or it would be:

ax.plot(mpl.length2num(values.rescale('cm')) )
ax.set_xlim(mpl.length2num(limits.rescale('cm')) )

In the end, I think datetimes are easier, not as many options.

I'm not sure all this was very clear, but hopefully it added some signal with the noise!

-Chris


Pierre_GM2 · May 21, 2009, 7:55pm

<push product>
Er... Has anybody tried the plotting capabilities of scikits.timeseries (pytseries.sourceforge.net)? In short, the package provides some extensions to matplotlib to plot timeseries. One of these extensions changes the ticks depending on the zoom level: start over a few decades and ticks will be every 5 years or so. Select a smaller area and the ticks will be every quarter, you get the idea. The series associated with the plot (either the first plotted or one given at plot creation) sets the units (frequency) of the xaxis. Afterwards, other series plotted on the same plot are converted to the plot's frequency with our own conversion routines.

These extensions were coded about 18 months ago, at a time when the support for units was nonexistent (or hidden somewhere I never found it). A couple of weeks ago I realized that unit conversion would probably be the way to go (and that in general, our extensions should be rewritten).

Anyway, the zoom-level dependent ticks we implemented might be a good starting point for implementing a "locator/formatter that decides whether to display cm or km"...
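
As a rough illustration of the zoom-dependent idea only: the sketch below picks a tick spacing from the length of the visible range. The function name and thresholds are made up for this example and are not the logic scikits.timeseries actually uses.

```python
# Hypothetical thresholds; not the logic scikits.timeseries actually uses.
def tick_step_for_span(span_years):
    """Return (step, unit) for the visible x-range, given its length in years."""
    if span_years >= 20:
        return 5, "years"      # a few decades in view: a tick every 5 years
    elif span_years >= 2:
        return 1, "years"
    elif span_years >= 0.5:
        return 3, "months"     # quarterly ticks
    return 1, "months"


if __name__ == "__main__":
    for span in (30, 5, 1, 0.25):
        print(span, tick_step_for_span(span))
```

The same shape of decision (visible range in, step and unit out) would apply to a length axis choosing between cm and km.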

I'd be quite happy to get some feedback about these extensions...

Cheers
P.
</push product>

_Robert_Kern · May 21, 2009, 8:05pm

Well, if we're pushing products, Chaco has a subsystem for doing exactly this in a generic fashion for times or anything else:

https://svn.enthought.com/svn/enthought/Chaco/trunk/enthought/chaco/scales/

It was written to be self-contained so that it could be shared with matplotlib or anything else that needs it.

···

On 2009-05-21 14:55, Pierre GM wrote:

Anyway, the zoom-level dependent ticks we implemented might be a good
starting point for implementing a "locator/formatter that decides
whether to display cm or km"...

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

_John_Hunter · May 21, 2009, 8:13pm

No, this is incorrect. The object can have any API it wants. The
person who wants to add support for that object registers the object
type with a converter class for that object. The converter class can
be entirely external to the class, as in the datetime example I
posted, so the object's API is not exposed to mpl. This is the
crucial distinction. The converter class at a minimum must know how
to convert the object to a sequence of floats.
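
The registration pattern described here can be imitated in a few lines. This is a toy stand-in that runs without matplotlib: the real hooks are `matplotlib.units.registry` and `matplotlib.units.ConversionInterface` (whose `convert` takes `(obj, unit, axis)`), while `registry`, `DatetimeConverter` and `convert_for_plotting` below are hypothetical names used only for illustration.

```python
from datetime import datetime

# Toy registry imitating the pattern; not matplotlib's implementation.
registry = {}


class DatetimeConverter:
    """Knows how to turn datetime objects into floats; datetime itself is untouched."""

    @staticmethod
    def convert(values, unit=None):
        epoch = datetime(1, 1, 1)  # assumed base, for illustration only
        return [(v - epoch).total_seconds() / 86400.0 for v in values]


def convert_for_plotting(values):
    """Look up a converter by the type of the first element and apply it."""
    converter = registry.get(type(values[0]))
    if converter is None:
        return [float(v) for v in values]  # plain numbers pass through
    return converter.convert(values)


# Registration happens outside the datetime class, so the plotting code
# never needs to know anything about datetime's own API.
registry[datetime] = DatetimeConverter()
```

The point of the sketch is that the type being plotted is never modified or required to grow a method such as `tofloat()`; only the externally registered converter knows how to produce floats.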

JDH

···

On Thu, May 21, 2009 at 2:20 PM, Christopher Barker <Chris.Barker@...236...> wrote:

John Hunter wrote:

<Chris.Barker@...236...> wrote:

If it's going to be done, I think it really shouldn't be too MPL
specific -- it should be built on a good (and hopefully eventually
widely used) unit-array system, perhaps like Darren Dale's Quantities
package (there are quite a few others that should be looked at also).

This is not how it works -- we will not be assuming any units package.
Rather, we provide an interface where any units package can be used
with mpl.

Fair enough, but you still need to require a particular API to a unit-ed
object, which is not so different.