John Hunter wrote:
If it's going to be done, I think it really shouldn't be too MPL
specific -- it should be built on a good (and hopefully eventually
widely used) unit-array system, perhaps like Darren Dale's Quantities
package (there are quite a few others that should be looked at also).
This is not how it works -- we will not be assuming any units package.
Rather, we provide an interface where any units package can be used
with mpl.
Fair enough, but you still need to require a particular API for a unit-ed object, which is not so different.
One thing that strikes me is that there is a distinct difference between something like Darren's Quantities (and other numpy-based packages) and what MPL now supports for datetimes -- in Quantities, the sequence itself has units, whereas with datetimes, you use a generic sequence, and each element has units. I suppose that difference can be dealt with in the API, though.
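To make that difference concrete, here is a rough sketch of how an API might accept both styles. UnitArray, UnitScalar and split_units are hypothetical names made up for illustration; they are not part of Quantities, JPL's package, or mpl.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UnitArray:
    """Hypothetical Quantities-style container: one unit for the whole sequence."""
    values: Sequence[float]
    unit: str


@dataclass
class UnitScalar:
    """Hypothetical datetime-style element: every item carries its own unit."""
    value: float
    unit: str


def split_units(data):
    """Return (plain floats, unit) for either style of unit-ed input."""
    if isinstance(data, UnitArray):
        return list(data.values), data.unit
    # Generic sequence of unit-ed elements: insist they all agree.
    units = {item.unit for item in data}
    if len(units) != 1:
        raise ValueError(f"expected exactly one unit, got {sorted(units)}")
    return [item.value for item in data], units.pop()
```

Either way the plotting side ends up with plain floats plus one unit tag, which is about the minimum it would need to know.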
The original use case is that the JPL has an internal units
package and they want to pass their objects directly to mpl
But, of course, the rest of us probably don't want to (or can't) use JPL's package, so we'll want a more generic package to test with and write samples for, etc.
In general, I think it's next to impossible to write a generic API without AT LEAST two use cases -- so maybe JPL's and Quantities would be a good start.
One nice thing about this is we were able to extend support to native
datetime objects (which we cannot modify obviously) to mpl, so this
facility works with both proper unit types as well as arbitrary types.
And I have enjoyed the DateTime support (except when it's not there, natch!). In thinking about this more, I think the real benefit is in coupling the units support with nifty things like AutoLocators and AutoFormatters -- these are great for DateTimes, and my first thought was "who cares" for simpler units like meters. However, on reflection, I realize that I've written a fair bit of code for my data that may be in meters, for instance, that goes like:
    if max_value < 1:
        do_stuff_to_display_centimeters()
    elif max_value < 1000:
        do_stuff_to_display_meters()
    else:
        do_stuff_to_display_kilometers()
It would be nice to push that stuff into an MPL locator and formatter, even if I do need to write them myself. And, ideally between us all, a nice collection of generic ones could be written.
I could (and now that I think about it, will) still do that by simply ensuring my data are always in a particular unit, but it would be nicer if the locators could be unit aware, so that one could pass in any length unit, and apply a "SI_length_Formatter" to it. Or just SI_Formatter, now that I think about it.
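A minimal sketch of the formatting half, assuming the values have already been converted to meters. The name si_length_label is hypothetical, and this version picks the unit per value rather than once per axis.

```python
def si_length_label(value_m, pos=None):
    """Format a length in meters as cm, m, or km depending on its magnitude.

    The (value, pos) signature is the one matplotlib.ticker.FuncFormatter
    passes to the function it wraps; pos is unused here.
    """
    size = abs(value_m)
    if size == 0:
        return "0 m"
    if size < 1:
        return f"{value_m * 100:g} cm"
    if size < 1000:
        return f"{value_m:g} m"
    return f"{value_m / 1000:g} km"
```

A function like this can be wrapped in matplotlib.ticker.FuncFormatter and handed to an axis with set_major_formatter. It does nothing about tick placement, which would still need a locator.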
I'm not sure how to resolve one issue:
If I have a locator/formatter that decides whether to display cm or km, etc, depending on values, I probably want the axis label to reflect that too, but I don't know how one can get all those to communicate.
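One possible way to get them to communicate, sketched under the assumption that everything is stored in meters: make the unit choice once, in a small shared object, and have both the tick formatter and the axis label ask it. LengthScale and its methods are hypothetical, not an existing mpl class.

```python
class LengthScale:
    """Hypothetical helper: pick one display unit for an axis and share it."""

    def __init__(self, vmin_m, vmax_m):
        # Choose the unit from the largest magnitude in the view limits.
        span = max(abs(vmin_m), abs(vmax_m))
        if span < 1:
            self.meters_per_unit, self.suffix = 0.01, "cm"
        elif span < 1000:
            self.meters_per_unit, self.suffix = 1.0, "m"
        else:
            self.meters_per_unit, self.suffix = 1000.0, "km"

    def tick_label(self, value_m, pos=None):
        """Bare number in the chosen unit; the unit itself goes in the axis label."""
        return f"{value_m / self.meters_per_unit:g}"

    def axis_label(self, name):
        """Axis label that names the unit the ticks are shown in."""
        return f"{name} ({self.suffix})"
```

The open question remains who rebuilds the scale and resets the label when the view limits change; this only shows that a single shared decision keeps the two consistent.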
Also, it sounds like you're talking about converting units to the same something -- but, for length, it might be feet, or miles, or cm, or.... This is a bit different than what is done for time, where datetimes are always converted to the same base -- days since 0001-01-01 00:00:00. Perhaps this convention could be followed with a standard base unit for length, etc. though maybe that wouldn't capture the range of precisions that may be required -- some data in centuries, some in nanoseconds...
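A rough sketch of the "standard base unit" idea and of the precision worry, using only the standard library (math.ulp needs Python 3.9 or later). The function names and the choice of meters as the base are assumptions for illustration; the conversion factors are the standard definitions of the international foot and mile.

```python
import math
from datetime import date

# Meters per unit.
METERS_PER_UNIT = {"cm": 0.01, "m": 1.0, "km": 1000.0, "ft": 0.3048, "mi": 1609.344}


def to_meters(values, unit):
    """Convert a sequence of lengths in `unit` to the base unit (meters)."""
    factor = METERS_PER_UNIT[unit]
    return [v * factor for v in values]


def float_day_resolution_seconds(day):
    """Smallest time step representable near `day` when stored as float days.

    Uses the proleptic Gregorian ordinal (0001-01-01 is day 1) as the
    day count and measures the spacing between adjacent floats there.
    """
    return math.ulp(float(day.toordinal())) * 86400.0
```

For a date in 2008 the resolution comes out at roughly ten microseconds, so a single float-days base would not capture nanosecond data.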
(by the way, there was some work on handling datetimes with numpy arrays a while back -- I wonder what came of that?)
I'm open to the idea of not supporting post-facto conversions after
data is added, but am mostly minus one on it, and I'd like to hear
from the JPL who requested the ability initially. I think their users
are working with complex plots and might have arrays in different
distance units, and would like to be able to pass in any distance
units as long as conversion is possible.
I can see that, but suggest that the unit finally displayed by the plot be specified by an axis method, or Locators or Formatters, or ??, but in any case, not change depending on what order you add data to the plot.
It would be pretty cool to be able to do:
    ax.plot(x, data_in_feet)
    ax.plot(x, data_in_meters)
and get it all scaled right!
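A toy sketch of how that could work without the result depending on the order data are added: the axis is told its display unit up front, and each data set is converted as it comes in. UnitAxis is a hypothetical stand-in, not an mpl class.

```python
class UnitAxis:
    """Hypothetical axis that keeps all data in one display unit chosen up front."""

    _METERS_PER_UNIT = {"m": 1.0, "ft": 0.3048, "km": 1000.0}

    def __init__(self, display_unit):
        self.display_unit = display_unit
        self.series = []

    def add(self, values, unit):
        """Convert `values` from `unit` to the display unit and store them."""
        scale = self._METERS_PER_UNIT[unit] / self._METERS_PER_UNIT[self.display_unit]
        self.series.append([v * scale for v in values])


# Usage: feet and meters land on the same scale, whichever is added first.
axis = UnitAxis("m")
axis.add([10.0, 20.0], "ft")
axis.add([3.0, 6.0], "m")
```

Because the display unit belongs to the axis rather than to the first data set, swapping the two add calls gives the same numbers.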
there is little difference between the code
you suggest::

    ax.plot(values.rescale('cm'))
    ax.set_xlim(limits.rescale('cm'))

and::

    ax.plot(values.rescale('cm').tofloat())
    ax.set_xlim(limits.rescale('cm').tofloat())

where the latter means we have no units or custom type support.
There are a couple of differences:
1) With date2num, we still always use float days since the epoch for the actual numbers. That means that there can be one set of formatters. In that example, what units would tofloat() return? If we want formatters to work, some info about the units needs to be passed into mpl.
2) In the second version -- every unit-ed data type would have to have a tofloat() method (and what units would those floats be in?), or it would be:
    ax.plot(mpl.length2num(values.rescale('cm')))
    ax.set_xlim(mpl.length2num(limits.rescale('cm')))
In the end, I think datetimes are easier, not as many options.
I'm not sure all this was very clear, but hopefully it added some signal with the noise!
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@...236...