Units discussion...

Hi all,

To carry on the Gitter discussion about unit handling, hopefully leading to more stringent documentation and implementation.

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user as unit-aware data, though not necessarily in the unit that the user supplied. E.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply.
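A minimal sketch of the round trip described above, assuming a hypothetical `LengthConverter` that canonicalizes every length to meter floats. `Quantity`, `to_float`, and `from_float` are illustrative names only, not Matplotlib API.

```python
from dataclasses import dataclass

# Hypothetical unit type and converter; neither is Matplotlib API.
@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str  # "m", "cm", or "in"

METERS_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}

class LengthConverter:
    """Maps lengths to meter floats and back, independent of any axis state."""

    @staticmethod
    def to_float(q):
        # Forward mapping: the same length always lands on the same float.
        return q.value * METERS_PER_UNIT[q.unit]

    @staticmethod
    def from_float(x):
        # Inverse mapping: returns unit-aware data in the converter's own
        # convention (meters), not necessarily the unit the user supplied.
        return Quantity(x, "m")

x = LengthConverter.to_float(Quantity(8, "in"))  # meter float, about 0.2032
back = LengthConverter.from_float(x)             # Quantity in meters
```

The user who supplied inches gets meters back, which is the "whatever convention the converter wants" behaviour.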

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
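As a rough illustration, assuming the stored floats are meters, a downstream formatter could label ticks in inches without touching the data. `inches_formatter` is a hypothetical name; its `(x, pos)` signature is the one `matplotlib.ticker.FuncFormatter` expects, so it could be wrapped that way.

```python
METERS_PER_INCH = 0.0254

def inches_formatter(x, pos=None):
    """Label a tick stored as a meter float in inches.

    Hypothetical downstream helper; it could be wrapped as
    matplotlib.ticker.FuncFormatter(inches_formatter).
    """
    return f"{x / METERS_PER_INCH:.1f} in"

# The stored positions stay in meters; only the labels change.
tick_positions_m = [0.0, 0.0254, 0.2032]
labels = [inches_formatter(x, i) for i, x in enumerate(tick_positions_m)]
```

Switching the plot to cm would mean swapping the formatter (and locator), not reconverting the stored floats.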

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
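A sketch of the "one converter per axis" rule, under the assumption that the converter is chosen from the type of the first data seen. The `Axis` class here is a hypothetical stand-in, not `matplotlib.axis.Axis`, and both converter classes are made up for illustration.

```python
class PassThroughConverter:
    """Hypothetical converter for axes whose first data were bare floats."""
    def to_float(self, value):
        return float(value)

class InchConverter:
    """Hypothetical converter: (number, "in") tuples -> meter floats."""
    def to_float(self, value):
        if isinstance(value, (int, float)):
            # This converter chooses to pass bare floats through unchanged.
            return float(value)
        number, unit = value
        if unit != "in":
            raise ValueError(f"unsupported unit: {unit!r}")
        return number * 0.0254

class Axis:
    """Hypothetical stand-in for an axis; not matplotlib.axis.Axis."""
    def __init__(self):
        self.converter = None

    def _pick_converter(self, value):
        if isinstance(value, tuple):
            return InchConverter()
        return PassThroughConverter()

    def update(self, value):
        # The first call fixes the converter; later calls reuse it.
        if self.converter is None:
            self.converter = self._pick_converter(value)
        return self.converter.to_float(value)

    def set_converter(self, converter):
        # Explicit override for users who know what they are doing.
        self.converter = converter

axis = Axis()
first = axis.update((8, "in"))  # latches InchConverter
second = axis.update(0.5)       # bare float still goes through the converter
```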

What's missing? I don't think this is wildly different from what we have, but maybe a bit clearer.

Cheers, Jody

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit API is pretty simple and lives in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc. So having a good object-oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be "convert once at construction and never touch units again". We are constantly adjusting limits, moving artists, etc. in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out. Things like which APIs in MPL should accept units and which shouldn't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables, to make it clearer what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

···

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak@uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel

Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth, and an OK from downstream users. I actually don't think it'll be a huge change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them every day will be absolutely key.

Cheers, Jody

···

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

Practically, I think what we are proposing is that for unit support the
user must supply two functions for each axis:

   - A mapping from your unit objects to floating point numbers
   - A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the
moment. Doing this would mean we can convert units as soon as they enter
Matplotlib, only ever have to deal with floating point numbers internally,
and then use the second function as late as possible when the user requests
stuff like e.g. the axis limits.
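The two-function idea can be sketched with standard-library datetimes standing in for unit objects. `UnitAxis`, `to_float`, and `from_float` are hypothetical names for illustration, not an existing or proposed Matplotlib interface.

```python
from datetime import datetime, timezone

# The two user-supplied functions (hypothetical names).
def to_float(dt):
    # Unit object -> float (seconds since the epoch).
    return dt.timestamp()

def from_float(x):
    # Float -> unit object.
    return datetime.fromtimestamp(x, tz=timezone.utc)

class UnitAxis:
    """Hypothetical axis that stores only floats internally."""
    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._lim = (0.0, 1.0)  # internal state: plain floats

    def set_lim(self, lo, hi):
        # Convert as soon as unitized data enters.
        self._lim = (self._to_float(lo), self._to_float(hi))

    def get_lim(self):
        # Convert back as late as possible, only when the user asks.
        return tuple(self._from_float(v) for v in self._lim)

axis = UnitAxis(to_float, from_float)
start = datetime(2018, 2, 3, tzinfo=timezone.utc)
end = datetime(2018, 2, 7, tzinfo=timezone.utc)
axis.set_lim(start, end)
limits = axis.get_lim()
```

Everything between `set_lim` and `get_lim` sees only floats, which is the point of converting at the boundary.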

Also worth noting that any major change like this will go into Matplotlib
3.0 at the earliest, so will be Python 3 only.

David

···

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

I think what's also being proposed, and I think Ted also suggested, is an
API audit to figure out how units are/are not being implemented in each
function. Potentially we could even try to smooth out inconsistencies (like
between plot and scatter).

···

On Feb 7, 2018 6:43 AM, "David Stansby" <dstansby at gmail.com> wrote:

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus. The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API. Some things are easy like the Axes/Axis API. But we also use low level APIs like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.
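One way to make the boundary explicit, sketched here purely as an assumption, is a decorator that marks a method as part of the unit API and converts its arguments before a float-only implementation runs. `accepts_units`, `inches_to_m`, and this `Rectangle` are hypothetical; the class is not `matplotlib.patches.Rectangle`.

```python
import functools

def accepts_units(to_float):
    """Hypothetical decorator marking a method as part of the unit API.

    Positional arguments are converted with `to_float` before the
    float-only implementation runs.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(self, *args):
            return func(self, *(to_float(a) for a in args))
        wrapper.unit_api = True  # lets tooling or docs list unit-aware methods
        return wrapper
    return decorate

def inches_to_m(value):
    # Hypothetical converter: (number, "in") tuples or bare numbers -> meters.
    if isinstance(value, tuple):
        number, unit = value
        if unit != "in":
            raise ValueError(f"unsupported unit: {unit!r}")
        return number * 0.0254
    return float(value)

class Rectangle:
    """Hypothetical patch-like class, for illustration only."""
    def __init__(self):
        self._width = 0.0

    @accepts_units(inches_to_m)
    def set_width(self, width):
        # Public unit API: `width` is already a float by the time we get here.
        self._set_width(width)

    def _set_width(self, width):
        # Internal: floats only, free to be optimized.
        self._width = width

rect = Rectangle()
rect.set_width((8, "in"))
```

The `unit_api` attribute is one possible hook for listing which methods accept units; whether low-level classes such as patches should carry it is exactly the open question.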

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats. Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

···

________________________________________
From: David Stansby <dstansby@gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

  * A mapping from your unit objects to floating point numbers
  * A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca<mailto:jklymak at uvic.ca>> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don?t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users. I actually don?t think it?ll be a huge change, probably just some clean up and better documentation.

FWIW, I?ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them everyday will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov<mailto:theodore.r.drain at jpl.nasa.gov>> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out: things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org<mailto:Matplotlib-devel at python.org>
https://mail.python.org/mailman/listinfo/matplotlib-devel

I'm momentarily a bit away from Matplotlib development due to real life
piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some
devs (including myself) being relatively dismissive about unit support is
the lack of a well-defined use case, other than "it'd be nice if we
supported units" (i.e., especially from the point of view of devs who
*don't* use units themselves, it ends up being an ever moving target). In
particular, tests on unit support ("unit unit tests"? :-)) currently only
rely on the old JPL unit code that ended up integrated into Matplotlib's
test suite, and do not test integration with the two major unit packages I
am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent
all kinds of relevant units. In particular, I was at some point hoping to
completely work in deunitized data internally, *including the plotting*,
and rely on the fact that the deunitized and the unitized data are
usually linked by an affine transform, so the plotting part doesn't need to
convert back to unitized data and we only need to place and label the ticks
accordingly; however Ted mentioned relativistic units, which imply the use
of a non-affine transform. So I think it would also be really helpful if
JPL could release some reasonably documented unit library with their actual
use cases (and how it differs from pint & astropy.units), so that we know
better what is actually needed (I believe carrying the JPL unit code in our
own code base is a mistake).
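
The affine assumption can be illustrated with a small sketch. It assumes (hypothetically) that the internal floats are meters, so that a tick label in another unit is only a scale and offset applied at formatting time. The names `AFFINE`, `tick_label` and `tick_positions` are invented for illustration and are not Matplotlib API.

```python
# Hypothetical affine maps from internal floats (meters) to display units:
# display = scale * internal + offset
AFFINE = {
    "m": (1.0, 0.0),
    "cm": (100.0, 0.0),
    "in": (1.0 / 0.0254, 0.0),
}


def tick_label(internal_value: float, unit: str) -> str:
    """Format an internal (meter) float as a label in `unit`."""
    scale, offset = AFFINE[unit]
    return f"{scale * internal_value + offset:.1f} {unit}"


def tick_positions(lo: float, hi: float, step: float, unit: str) -> list:
    """Internal positions where `unit` takes multiples of `step` in [lo, hi]."""
    scale, offset = AFFINE[unit]
    first = int(-(-(scale * lo + offset) // step))  # ceiling division
    last = int((scale * hi + offset) // step)       # floor division
    # Invert the affine map to express round display values as internal floats.
    return [(k * step - offset) / scale for k in range(first, last + 1)]


positions = tick_positions(0.0, 0.3, 5.0, "in")  # ticks every 5 inches
labels = [tick_label(p, "in") for p in positions]
print(labels)
```

A non-affine mapping (such as the relativistic time frames mentioned above) would not fit this scale-and-offset table, which is the concern raised in this paragraph.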

As for the public vs private, or rather unitized vs deunitized API
discussion, I believe a relatively simple and consistent line would be to
make Axes methods unitized and everything else deunitized (but with clear
ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <
theodore.r.drain at jpl.nasa.gov>:

That sounds fine to me. Our original unit prototype API actually had
conversions for both directions but I think the float->unit version was
removed (or really moved) when the ticker/formatter portion of the unit API
was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I
think that's a plus. The biggest issue we're going to run into is what's
defined as "internal" vs part of the unit API. Some things are easy like
the Axes/Axis API. But we also use low level APIs like the patches. Are
those unitized? This is the pro and con of using something like Python
where basically everything is public. It makes it possible to do lots of
things, but it's much harder to define a clear library with a specific
public API.

Somewhere in the process we should write a proposal that outlines which
classes/methods are part of the unit api and which are going to be
considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal
implementation classes are placed in a sub-package inside MPL 3.0, it
becomes clearer to people later on what the "official" public API is vs
what can be optimized to just use floats. Obviously the devs would need to
decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the
user must supply two functions for each axis:

  * A mapping from your unit objects to floating point numbers
  * A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the
moment. Doing this would mean we can convert units as soon as they enter
Matplotlib, only ever have to deal with floating point numbers internally,
and then use the second function as late as possible when the user requests
stuff like e.g. the axis limits.
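
A minimal sketch of the two mappings described above, assuming a hypothetical `Length` class and a converter that normalises everything to meter floats. None of these names come from Matplotlib or from any unit library; they only illustrate the round trip.

```python
from dataclasses import dataclass

# Hypothetical unit table: factor that converts one unit to meters.
_TO_METERS = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """Toy unitized value; stands in for a real unit library object."""
    value: float
    unit: str


def to_float(x: Length) -> float:
    """Mapping 1: unit object -> float (always meters in this sketch)."""
    return x.value * _TO_METERS[x.unit]


def from_float(f: float) -> Length:
    """Mapping 2: float -> unit object, in the converter's own convention."""
    return Length(f, "m")


internal = to_float(Length(8, "in"))  # what the plotting code would see
limit = from_float(internal)          # what the user would get back
print(internal, limit)
```

The value handed back is in meters rather than the inches that were supplied, matching the earlier point that the inverse need not return the user's original unit.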

Also worth noting that any major change like this will go into Matplotlib
3.0 at the earliest, so will be Python 3 only.

David



···

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
> On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

Thanks Antony,

> From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

... or an indication that the astropy (for instance) use-case is good enough to base an API around.

> As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

I was going to suggest that distinction as well. Anything that requires `axes.add_artist` is deunitized since we use those artists all over the place internally and keeping track of whether we have units or not would be really hard.
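
One way to picture that dividing line, as a sketch with hypothetical stand-in classes (not Matplotlib's real `Axes` or `Artist`): the axes-level entry point converts at the boundary, and anything passed through an `add_artist`-style call is assumed to already hold floats.

```python
class ToyArtist:
    """Stand-in for a low-level artist: stores plain floats only."""

    def __init__(self, xdata):
        if not all(isinstance(v, float) for v in xdata):
            raise TypeError("artists in this sketch accept floats only")
        self.xdata = list(xdata)


class ToyAxes:
    """Stand-in for an Axes: the only place where units are understood."""

    def __init__(self, to_float, from_float):
        self._to_float = to_float      # unit object -> float
        self._from_float = from_float  # float -> unit object
        self.artists = []

    def plot(self, xdata):
        """Unitized entry point: convert once, then build a float artist."""
        artist = ToyArtist([self._to_float(v) for v in xdata])
        return self.add_artist(artist)

    def add_artist(self, artist):
        """Deunitized entry point: the artist must already hold floats."""
        self.artists.append(artist)
        return artist

    def get_xlim(self):
        """Limits are converted back to unit objects as late as possible."""
        xs = [v for a in self.artists for v in a.xdata]
        return self._from_float(min(xs)), self._from_float(max(xs))


# Hypothetical unit representation: (value, unit) tuples, stored as meters.
factors = {"m": 1.0, "km": 1000.0}
ax = ToyAxes(lambda q: q[0] * factors[q[1]], lambda f: (f, "m"))
ax.plot([(1, "km"), (250.0, "m")])
print(ax.get_xlim())
```

Under this split, code that builds artists directly has to call the converter itself, which is why clear conversion helpers would need to accompany such a policy.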

Cheers, Jody



···

On Feb 8, 2018, at 8:09 AM, Antony Lee <antony.lee at berkeley.edu> wrote:
On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
> On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

Sorry - that's not what I meant. The unit conversions API that's in place works fine. I can't think of a better way to describe the use cases than the basic ones that seem (at least to me) to be obvious. Numbers with units (5*km) and time classes (datetime or some other time class like we use) are the primary use case. Another way to say it is that users have data where the normal representation is not float and they want to plot it, control how the transformation to float is done (plot in km or miles, in UTC or GPS time) and manipulate the plot after it's plotted (get bounds, change bounds, change units, move artists, edit data, etc) in the non-float representation that their data is already in.

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is to create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary APIs for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.
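
To make the `set_data` concern concrete, here is a sketch in which a hypothetical artist keeps a reference to its axis converter so that later edits can still be made in unitized space. `ToyLine` and `KmConverter` are invented for illustration and do not reflect how `Line2D` is implemented.

```python
class KmConverter:
    """Hypothetical converter: (value, unit) tuples <-> kilometer floats."""

    factors = {"km": 1.0, "m": 0.001, "mi": 1.609344}

    def to_float(self, q):
        # Bare numbers are assumed to already be in kilometers.
        if isinstance(q, tuple):
            return q[0] * self.factors[q[1]]
        return float(q)

    def from_float(self, f):
        return (f, "km")


class ToyLine:
    """Stand-in artist that stores floats but accepts unitized edits."""

    def __init__(self, xdata, converter):
        self._converter = converter
        self.set_xdata(xdata)

    def set_xdata(self, xdata):
        # Conversion happens on every edit, not only at construction.
        self._x = [self._converter.to_float(v) for v in xdata]

    def get_xdata(self, unitized=True):
        if unitized:
            return [self._converter.from_float(v) for v in self._x]
        return list(self._x)


line = ToyLine([(1, "km"), (2, "km")], KmConverter())
line.set_xdata([(500, "m"), (1, "mi")])  # edit after construction
print(line.get_xdata(unitized=False))
```

The design question in the thread is whether every artist should carry such a reference, or whether callers should convert explicitly before touching artists.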

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
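
As a sketch of what such a paired test might look like, assume a hypothetical function `set_limits` that accepts either bare floats or `(value, unit)` tuples. The helper and unit table are illustrative only; a real test would exercise an actual Matplotlib method with a registered converter.

```python
import unittest

FACTORS = {"m": 1.0, "km": 1000.0}  # hypothetical unit table (to meters)


def to_float(value):
    """Pass bare numbers through; convert (value, unit) tuples to meters."""
    if isinstance(value, tuple):
        number, unit = value
        return float(number) * FACTORS[unit]
    return float(value)


def set_limits(lo, hi):
    """Hypothetical function under test: returns limits as sorted floats."""
    a, b = to_float(lo), to_float(hi)
    return (min(a, b), max(a, b))


class TestSetLimits(unittest.TestCase):
    # The same behaviour is checked with bare floats and unitized inputs.
    CASES = [
        ((0.0, 2000.0), (0.0, 2000.0)),
        (((0, "m"), (2, "km")), (0.0, 2000.0)),
        (((2, "km"), (500, "m")), (500.0, 2000.0)),
    ]

    def test_unit_and_non_unit_inputs(self):
        for args, expected in self.CASES:
            with self.subTest(args=args):
                self.assertEqual(set_limits(*args), expected)


result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestSetLimits)
)
```

Keeping the unitized and bare-float cases in one table makes it harder to add a new code path that silently handles only one of them.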

Ted

···

________________________________________
From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony Lee <antony.lee@berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov<mailto:theodore.r.drain at jpl.nasa.gov>>:
That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster, so I think that's a plus. The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API. Some things are easy, like the Axes/Axis API. But we also use low-level APIs like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats. Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

  * A mapping from your unit objects to floating point numbers
  * A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
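
As a rough illustration of the two mappings described above, here is a minimal sketch. The `Length` class and the `LengthConverter` with its `to_float` / `from_float` methods are hypothetical names invented for this example; they are not part of Matplotlib's existing `units` API, and a real project would more likely build on pint or astropy.units.

```python
from dataclasses import dataclass

# Hypothetical toy unit table, only for this illustration.
_METERS_PER = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """Toy unitized value (assumption: not a real library type)."""
    value: float
    unit: str


class LengthConverter:
    """Hypothetical converter supplying both directions of the mapping."""

    @staticmethod
    def to_float(x: Length) -> float:
        # Function 1: unit object -> float (always meters internally).
        return x.value * _METERS_PER[x.unit]

    @staticmethod
    def from_float(v: float) -> Length:
        # Function 2: float -> unit object, in the converter's own convention.
        return Length(v, "m")


# Convert as soon as data enters, work with floats internally,
# and convert back only when the user asks for something like the limits.
xlim_floats = [LengthConverter.to_float(Length(8, "in")),
               LengthConverter.to_float(Length(50, "cm"))]
xlim_units = [LengthConverter.from_float(v) for v in xlim_floats]
```

Note that the round trip returns meters rather than the inches and centimeters that were supplied, which matches the convention discussed elsewhere in this thread.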

Also worth noting that any major change like this will go into Matplotlib 3.0 at the earliest, so will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users. I actually don't think it'll be a huge change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them every day will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out. Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation...

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user in unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
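
To make the tick-based idea concrete, here is a minimal sketch under stated assumptions: the axis stores plain floats in meters, and the helper names `inch_tick_locations` and `inch_tick_label` are made up for this illustration. The `(x, pos)` signature mirrors what `matplotlib.ticker.FuncFormatter` expects of its callable, but nothing here imports Matplotlib.

```python
import math

METERS_PER_INCH = 0.0254


def inch_tick_locations(vmin_m, vmax_m, step_in=2.0):
    """Return tick positions (float meters) that fall on round inch values."""
    first = math.ceil(vmin_m / METERS_PER_INCH / step_in)
    last = math.floor(vmax_m / METERS_PER_INCH / step_in)
    return [k * step_in * METERS_PER_INCH for k in range(first, last + 1)]


def inch_tick_label(x_m, pos=None):
    """Format a float-meter position as an inch label."""
    return f"{x_m / METERS_PER_INCH:g} in"


# The axis limits stay in meters; only tick placement and labels know about inches.
ticks = inch_tick_locations(0.0, 0.21)
labels = [inch_tick_label(t) for t in ticks]
```

The stored data never changes when the user switches from meters to inches; only the locator and formatter are swapped.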

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
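
A minimal sketch of the "one converter per axis, fixed by the first call" rule follows. `ToyAxis`, `DateConverter`, and `REGISTRY` are hypothetical stand-ins written for this example, not Matplotlib classes; the date mapping is only loosely modelled on the `date2num` analogy and uses float days since an arbitrary epoch.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class DateConverter:
    """Toy converter: aware datetimes <-> float days since EPOCH."""

    def to_float(self, x):
        if isinstance(x, (int, float)):
            return float(x)  # this converter chooses to pass bare floats through
        return (x - EPOCH) / timedelta(days=1)

    def from_float(self, v):
        return EPOCH + timedelta(days=v)


# Hypothetical lookup from data type to converter class.
REGISTRY = {datetime: DateConverter}


class ToyAxis:
    def __init__(self):
        self.converter = None  # unset until the first call supplies data

    def convert(self, data):
        if self.converter is None:
            # The first call decides the converter for the life of the axis.
            self.converter = REGISTRY[type(data[0])]()
        # Every later call, bare floats included, goes through the same converter.
        return [self.converter.to_float(x) for x in data]

    def set_converter(self, converter):
        # Explicit override for users who know what they are doing.
        self.converter = converter


axis = ToyAxis()
first = axis.convert([datetime(1970, 1, 2, tzinfo=timezone.utc),
                      datetime(1970, 1, 3, 12, tzinfo=timezone.utc)])
later = axis.convert([3.0])
roundtrip = axis.converter.from_float(first[1])
```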

What's missing? I don't think this is wildly different than what we have, but maybe a bit more clear.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel

The problem here (as you mentioned) is that essentially close to everything
is a public API in Matplotlib, and I believe quite strongly that it is
unreasonable to make every function check for unitized data (and what about
attributes? are bboxes and transforms supposed to handle units too?). For
example, this leads to Line2D.get_data having the orig=True/False kwarg and
the class needs to internally keep both unitized and deunitized data
around; duplicating this support throughout all artists would be a *lot* of
code.

While Axes methods can reasonably support units out of the box, I think it
is more reasonable to have (up to bikeshedding) `Axes.unitize` and
`Axes.deunitize` and then have people who need to play with the artists
themselves do e.g. `artist.set_data(artist.axes.deunitize(unitized_data))`
and `artist.axes.unitize(artist.get_data())`. Yes, I realize this may be
more work for you, but it's also a tradeoff of less work for us :-) With
this design, another possibility (which I guess Tom is not going to like,
but I actually think is reasonable) would be for you to patch all the
Artist classes yourself to support unitized data in all methods you want
(using the proper wrapper methods).
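
For concreteness, the `unitize` / `deunitize` pattern suggested above might look roughly like the sketch below. `ToyAxes` and `ToyLine` are hypothetical stand-ins, and representing unitized data as `(value, unit)` tuples is an assumption made only to keep the example self-contained; no such methods exist on Matplotlib's `Axes` today.

```python
class ToyAxes:
    """Hypothetical stand-in for an Axes that owns the unit conversion."""

    _factor = {"m": 1.0, "km": 1000.0}

    def deunitize(self, data):
        # (value, unit) pairs -> plain floats in meters
        return [value * self._factor[unit] for value, unit in data]

    def unitize(self, floats):
        # plain floats in meters -> (value, "m") pairs
        return [(v, "m") for v in floats]


class ToyLine:
    """Hypothetical artist that only ever stores plain floats."""

    def __init__(self, axes):
        self.axes = axes
        self._x = []

    def set_data(self, x):
        self._x = list(x)

    def get_data(self):
        return list(self._x)


line = ToyLine(ToyAxes())
# Users working directly with artists convert explicitly at the boundary.
line.set_data(line.axes.deunitize([(1.5, "km"), (250.0, "m")]))
unitized = line.axes.unitize(line.get_data())
```

The artist never needs to keep both unitized and deunitized copies of its data; the cost is the extra explicit call at each boundary.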

The "moving target" part is basically that there has never been complete
support for units everywhere in the code base, and because things are added
in a piecemeal fashion rather than with a well thought-out design, I'm a
bit tired of the constant stream of "oh, this function doesn't support
datetimes, we need to fix it". Again, I believe rethinking the design in a
comprehensive fashion would help with that.

Antony

2018-02-08 18:13 GMT+01:00 Drain, Theodore R (392P) <
theodore.r.drain at jpl.nasa.gov>:

Sorry - that's not what I meant. The unit conversions API that's in place
works fine. I can't think of a better way to describe the use cases than
the basic ones that seem (at least to me) to be obvious. Numbers with
units (5*km) and time classes (datetime or some other time class like we
use) are the primary use case. Another way to say it is that users have
data where the normal representation is not float and they want to plot it,
control how the transformation to float is done (plot in km or miles, in
UTC or GPS time) and manipulate the plot after it's plotted (get bounds,
change bounds, change units, move artists, edit data, etc) in the non-float
representation that their data is already in.

I realize that units are "a pain", but they're hugely useful. Just
plotting datetimes is going to be a pain without units (and was a huge pain
before the unit system). The proposal that only Axes supports units is
going to cause us a massive problem as that's rarely everything that we do
with a plot. I could do a survey to find all the interactions we use (and
that doesn't even touch the 1000's of lines of code our users have written)
if that would help but anything that's part of the public api (axes,
artists, patches, etc) is probably being used - i.e. pretty much anything
that's in the current user's guide is something that we use/want/need to
work with unitized data.

This is kind of what I meant in my previous email about use cases. Saying
"just Axes has units" is basically saying the only valid unit use case is
create a plot one time and look at it. You can't manipulate it, edit it,
or build any kind of plotting GUI application (which we have many of) once
the plot has been created. The Artist classes are one of the primary APIs
for applications. Artists are created, edited, and manipulated if you want
to allow the user to modify things in a plot after it's created. Even
the most basic cases like calling Line2D.set_data() wouldn't be allowed
with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.
The reason it keeps popping up is that code gets added without someone
considering units, which then triggers bug reports that require fixing.
If there was a clearer policy and new code was required to have test cases
that cover non-unit and unit inputs, I think things would go much
smoother. We'd be happy to help with submitting new test cases to cover
unit cases in existing code once a policy is decided on. Maybe what's
needed is better documentation for developers who don't use units so they
can easily write a test case with units when adding/modifying functionality.
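
One possible shape for such a test is sketched below, with the same assertion run against plain and unitized inputs. The `Km` class, `deunitize`, and `span` are toy names invented for this example, and the standard-library `unittest` module is used only to keep the sketch self-contained; it is not meant to describe how Matplotlib's own test suite is organized.

```python
import unittest


class Km:
    """Toy unit type for the test (assumption: not a real library class)."""

    def __init__(self, value):
        self.value = value


def deunitize(x):
    # Km objects become float meters; bare numbers pass through as floats.
    return x.value * 1000.0 if isinstance(x, Km) else float(x)


def span(lo, hi):
    """Code under test: must work for unitized and plain inputs alike."""
    return deunitize(hi) - deunitize(lo)


class TestSpan(unittest.TestCase):
    def test_unit_and_plain_inputs(self):
        cases = {
            "plain floats": (1000.0, 3000.0),
            "unitized": (Km(1), Km(3)),
        }
        for name, (lo, hi) in cases.items():
            # Each input flavour is reported separately if it fails.
            with self.subTest(inputs=name):
                self.assertEqual(span(lo, hi), 2000.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSpan)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```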

Ted



> I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see GitHub issue #9713 in matplotlib/matplotlib, "Units handling different with plot than other functions...", for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
   a) in matplotlib modules
   b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don't need unit support. But I don't think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary API's for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without something considering units which then triggers a bug reports which require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com> <anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com>> on behalf of Antony Lee <antony.lee at berkeley.edu <mailto:antony.lee at berkeley.edu>>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>>>:
That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to easier and faster so I think that's a plus. The biggest issue we're going to run in to is what's defined as "internal" vs part of the unit API. Some things are easy like the Axes/Axis API. But we also use low level API's like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats. Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
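As a rough illustration of the two mappings and of "convert on entry, unitize on request", here is a minimal sketch. `Length`, `to_float`, `from_float` and `LimitsSketch` are hypothetical names made up for this example and are not part of Matplotlib; the choice of metres as the internal float is likewise just an assumption.

```python
from dataclasses import dataclass

# Hypothetical unit type and conversion table, for illustration only.
M_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    value: float
    unit: str


def to_float(q):
    """Function 1: unit object -> float (this converter always uses metres)."""
    return q.value * M_PER_UNIT[q.unit]


def from_float(x):
    """Function 2: float -> unit object, in the converter's own convention."""
    return Length(x, "m")


class LimitsSketch:
    """Stand-in for an axis: stores floats, unitizes only on the way out."""

    def __init__(self):
        self._lim = (0.0, 1.0)

    def set_lim(self, lo, hi):
        # Convert as soon as unit objects enter.
        self._lim = (to_float(lo), to_float(hi))

    def get_lim(self):
        # Convert back as late as possible, when the user asks.
        return tuple(from_float(v) for v in self._lim)


ax = LimitsSketch()
ax.set_lim(Length(0, "cm"), Length(50, "cm"))
lo, hi = ax.get_lim()
print(lo.unit, hi.unit)  # m m
```

Note that the limits come back in metres even though they were supplied in centimetres; which unit the inverse returns is entirely the converter's decision.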

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users. I actually don't think it'll be a huge change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them every day will be absolutely key.

Cheers, Jody

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out. Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation.

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied, e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).
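To make the 8*in example concrete, a fixed, invertible float mapping could look roughly like this. The names `length2num` and `num2length` are invented here by analogy with `date2num`; they are assumptions for the sketch, not existing functions.

```python
# Hypothetical converter: every length becomes a bare float in metres.
M_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def length2num(value, unit):
    """Map a (value, unit) pair to a float in metres; the mapping never changes."""
    return value * M_PER_UNIT[unit]


def num2length(x):
    """Inverse mapping: a float back to a unit-aware (value, unit) pair."""
    return (x, "m")


x = length2num(8, "in")   # what the plotting code would see
back = num2length(x)      # what the user would get back
print(round(back[0], 3), back[1])  # 0.203 m
```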

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
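A sketch of what "display units live in the locator and formatter" might mean in practice, assuming the internal floats are metres. `make_formatter` and `tick_locations` are hypothetical helpers; the `(x, pos)` signature is chosen only so that the function could be wrapped in `matplotlib.ticker.FuncFormatter`.

```python
import math

M_PER_IN = 0.0254  # metres per inch


def make_formatter(unit_name, metres_per_unit):
    """Build a tick-label function that shows metre floats in another unit."""
    def fmt(x, pos=None):
        return f"{x / metres_per_unit:g} {unit_name}"
    return fmt


def tick_locations(vmin, vmax, step, metres_per_unit):
    """Tick positions (as metre floats) at multiples of `step` display units."""
    first = math.ceil(vmin / metres_per_unit / step)
    last = math.floor(vmax / metres_per_unit / step)
    return [k * step * metres_per_unit for k in range(first, last + 1)]


inch_fmt = make_formatter("in", M_PER_IN)
ticks = tick_locations(0.0, 0.25, step=2, metres_per_unit=M_PER_IN)
labels = [inch_fmt(t) for t in ticks]
print(labels)  # ['0 in', '2 in', '4 in', '6 in', '8 in']
```

The plotted data never changes; switching the axis from inches to metres only means swapping the locator and formatter.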

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
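The "one converter per axis, chosen on first use" rule might be sketched as follows. `AxisSketch`, `length_converter`, `float_converter` and `pick_converter` are all hypothetical, and representing unitized values as `(value, unit)` tuples is an assumption made to keep the example short.

```python
M_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def length_converter(data):
    """Accepts (value, unit) pairs and chooses to pass bare numbers through."""
    out = []
    for item in data:
        if isinstance(item, tuple):
            value, unit = item
            out.append(value * M_PER_UNIT[unit])
        else:
            out.append(float(item))
    return out


def float_converter(data):
    """Accepts bare numbers only; unitized input is rejected."""
    if any(isinstance(item, tuple) for item in data):
        raise TypeError("this axis was set up for bare floats")
    return [float(item) for item in data]


def pick_converter(data):
    """Choose a converter from the first data the axis sees."""
    if any(isinstance(item, tuple) for item in data):
        return length_converter
    return float_converter


class AxisSketch:
    def __init__(self):
        self.converter = None  # the user may set or clear this explicitly

    def convert(self, data):
        if self.converter is None:
            self.converter = pick_converter(data)  # set by the first call
        return self.converter(data)  # every later call goes through it


ax = AxisSketch()
first = ax.convert([(8, "in")])   # fixes the converter for this axis
second = ax.convert([1.0])        # bare floats still go to the converter
```

Whether bare floats are accepted after a unitized first call is then the converter's decision, not Matplotlib's.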

What's missing? I don't think this is wildly different than what we have, but maybe a bit more clear.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

Sorry if it came across that way - I wasn't trying to say that it doesn't need work. I completely agree with you on 1) and 2).

As far as what to do about units in the draw stage - I'm not saying anything like that (though that might be the result). I'm saying that to support units, we should design the APIs to support units and be explicit about which APIs support units and which don't (new doc tag maybe?). I'm not making any statements about the underlying effect of that statement on the code. There are probably a number of designs that meet that API goal but I don't know enough of the MPL internals to advocate for one over the other.
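One possible shape for such a "doc tag" is a decorator that records the policy on the function and appends it to the docstring. This is purely a sketch: `unit_policy`, `ArtistSketch` and `unitized_methods` are hypothetical and nothing like them is known to exist in Matplotlib.

```python
def unit_policy(accepts, returns):
    """Hypothetical 'doc tag': record and document a callable's unit policy."""
    def decorate(func):
        func.unit_policy = {"accepts_units": accepts, "returns_units": returns}
        func.__doc__ = (func.__doc__ or "").rstrip() + (
            f"\n\nUnit policy: accepts units: {accepts}; returns units: {returns}."
        )
        return func
    return decorate


class ArtistSketch:
    @unit_policy(accepts=True, returns=True)
    def set_x(self, x):
        """Set the x position."""
        self._x = x

    @unit_policy(accepts=False, returns=False)
    def set_x_raw(self, x):
        """Set the x position from a bare float."""
        self._x = float(x)


def unitized_methods(cls):
    """List the methods of cls that are tagged as accepting units."""
    return sorted(
        name for name, member in vars(cls).items()
        if getattr(member, "unit_policy", {}).get("accepts_units")
    )


print(unitized_methods(ArtistSketch))  # ['set_x']
```

A tag like this would also let a test suite enumerate every unit-accepting method and check each one with unitized input.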

I think we can help with building a better toy unit system. Or we can standardize on datetime and some existing unit package. Whatever makes it easier for people to write test cases.

···

________________________________________
From: Jody Klymak <jklymak@uvic.ca>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see https://github.com/matplotlib/matplotlib/issues/9713 for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
        a) in matplotlib modules
        b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don't need unit support. But I don't think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary API's for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.
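A test that covers both plain and unitized inputs does not have to be elaborate. The sketch below uses only the standard library; `to_float` and `span_width` are hypothetical stand-ins for a converter and for newly added plotting code, and unitized values are again modelled as `(value, unit)` tuples.

```python
import unittest

M_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def to_float(x):
    """Toy converter: (value, unit) pairs become metres, bare numbers pass."""
    if isinstance(x, tuple):
        value, unit = x
        return value * M_PER_UNIT[unit]
    return float(x)


def span_width(xmin, xmax):
    """Stand-in for new plotting code that must accept both kinds of input."""
    return to_float(xmax) - to_float(xmin)


class TestSpanWidth(unittest.TestCase):
    def test_plain_and_unitized_inputs(self):
        cases = [
            ((0.0, 0.5), 0.5),                  # bare floats
            (((0, "cm"), (50, "cm")), 0.5),     # unitized, centimetres
            (((0, "m"), (0.5, "m")), 0.5),      # unitized, metres
        ]
        for (xmin, xmax), expected in cases:
            with self.subTest(xmin=xmin, xmax=xmax):
                self.assertAlmostEqual(span_width(xmin, xmax), expected)
```

Saved as a module, this can be run with `python -m unittest`; the same case table would work equally well as a parametrized test in another framework.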

Ted

________________________________________
From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony Lee <antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so that the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).
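To illustrate why the affine assumption matters: under an affine map, a point halfway between two internal floats is also halfway between the corresponding displayed values, so straight lines stay straight and only the tick labels need to change. The sketch below uses kelvin to Fahrenheit as the affine case and a squaring function as a generic non-affine stand-in; the latter is an arbitrary choice for illustration and is not meant to model relativistic time frames.

```python
def kelvin_to_fahrenheit(k):
    """Affine map: display = a * internal + b."""
    return k * 9.0 / 5.0 - 459.67


def nonaffine(x):
    """Arbitrary non-affine stand-in, for contrast only."""
    return x ** 2


def midpoint(a, b):
    return (a + b) / 2.0


def preserves_midpoints(f, x0, x1, tol=1e-9):
    """True if f maps the midpoint of x0, x1 to the midpoint of f(x0), f(x1)."""
    return abs(f(midpoint(x0, x1)) - midpoint(f(x0), f(x1))) < tol


k0, k1 = 273.15, 373.15
affine_ok = preserves_midpoints(kelvin_to_fahrenheit, k0, k1)
nonaffine_ok = preserves_midpoints(nonaffine, k0, k1)
print(affine_ok, nonaffine_ok)  # True False
```

In the non-affine case, geometry drawn in float space no longer corresponds to the same geometry in unitized space, which is why relabelling ticks alone would not be enough.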

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

*puts hand up* I'm (sort of...) a Matplotlib developer and use (am starting
to use) units in my day to day research.

My proposal remains this:

   - The user provides a method for converting objects of their custom
   class to floating point numbers
   - The user provides a method for converting floating point numbers to
   their custom class
   - *Everything* in Matplotlib that accepts data accepts custom type
   objects
   - The first thing it does is convert these objects to floats, and then
   everything we do internally is with those floats

Can anyone point out reasons that this isn't the right way to do it? This
has the advantages:

   - Matplotlib takes no responsibility for the conversion
   - We only ever calculate with floats
   - You can use whatever objects you want, as long as your converter goes
   object --> float
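The proposal above could be sketched as a thin wrapper applied to every data-accepting entry point, so that the body of each function only ever sees floats. `Length`, `to_float`, `from_float`, `deunitize` and `segment_length` are hypothetical names for this example, and converting to metres is an assumed convention.

```python
import functools
from dataclasses import dataclass

M_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    value: float
    unit: str


def to_float(obj):
    """User-supplied: custom object -> float. Bare numbers pass through."""
    if isinstance(obj, Length):
        return obj.value * M_PER_UNIT[obj.unit]
    return float(obj)


def from_float(x):
    """User-supplied: float -> custom object."""
    return Length(x, "m")


def deunitize(func):
    """Convert every positional argument to a float before calling func."""
    @functools.wraps(func)
    def wrapper(*args):
        return func(*(to_float(a) for a in args))
    return wrapper


@deunitize
def segment_length(x0, x1):
    # Internal code: plain float arithmetic only.
    return abs(x1 - x0)


d = segment_length(Length(8, "in"), Length(1, "m"))
unitized = from_float(d)  # handed back to the user on request
```

Mixed inputs such as `segment_length(0.5, Length(1, "m"))` also work here, because the toy converter chooses to accept bare floats.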

Having thought a little bit this seems like the obvious way to do it, but I
may be missing something.

David

···

On 8 February 2018 at 17:39, Jody Klymak <jklymak at uvic.ca> wrote:

I realize that units are "a pain", but they're hugely useful. Just
plotting datetimes is going to be a pain without units (and was a huge pain
before the unit system). The proposal that only Axes supports units is
going to cause us a massive problem as that's rarely everything that we do
with a plot. I could do a survey to find all the interactions we use (and
that doesn't even touch the 1000's of lines of code our users have written)
if that would help but anything that's part of the public api (axes,
artists, patches, etc) is probably being used - i.e. pretty much anything
that's in the current user's guide is something that we use/want/need to
work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features
(maybe the existing one is fine, but please see
https://github.com/matplotlib/matplotlib/issues/9713 for why I'm a little
dismayed with the state of things).

2) write a developer's guide explaining how units should be/are
implemented
        a) in matplotlib modules
        b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the
draw stage (or cache stage) for all artists. That's maybe fine, but as a
new developer, I found the units support woefully under-documented. The
fact that others have hacked in units support in various inconsistent ways
means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that
we don't need unit support. But I don't think we are being hypercritical
in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying
"just Axes has units" is basically saying the only valid unit use case is
create a plot one time and look at it. You can't manipulate it, edit it,
or build any kind of plotting GUI application (which we have many of) once
the plot has been created. The Artist classes are one of the primary API's
for applications. Artists are created, edited, and manipulated if you want
to allow the user to modify things in a plot after it's created. Even
the most basic cases like calling Line2D.set_data() wouldn't be allowed
with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.
The reason it keeps popping up is that code gets added without someone
considering units, which then triggers bug reports that require fixing.
If there was a clearer policy and new code was required to have test cases
that cover non-unit and unit inputs, I think things would go much
smoother. We'd be happy to help with submitting new test cases to cover
unit cases in existing code once a policy is decided on. Maybe what's
needed is better documentation for developers who don't use units so they
can easily write a test case with units when adding/modifying
functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony
Lee <antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life
piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some
devs (including myself) being relatively dismissive about unit support is
the lack of a well-defined use case, other than "it'd be nice if we supported
units" (i.e., especially from the point of view of devs who *don't* use
units themselves, it ends up being an ever moving target). In particular,
tests on unit support ("unit unit tests"? :-)) currently only rely on the
old JPL unit code that ended up integrated into Matplotlib's test suite,
but do not test integration with the two major unit packages I am aware
of (pint and astropy.units).

From Ted's email it appears that these are not sufficient to
represent all kinds of relevant units. In particular, I was at some point
hoping to work completely in deunitized data internally, *including the
plotting*, and rely on the fact that the deunitized and the unitized
data are usually linked by an affine transform, so that the plotting part
doesn't need to convert back to unitized data and we only need to place and
label the ticks accordingly; however, Ted mentioned relativistic units,
which imply the use of a non-affine transform. So I think it would also be
really helpful if JPL could release some reasonably documented unit library
with their actual use cases (and how it differs from pint & astropy.units),
so that we know better what is actually needed (I believe carrying the JPL
unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API
discussion, I believe a relatively simple and consistent line would be to
make Axes methods unitized and everything else deunitized (but with clear
ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov>:
That sounds fine to me. Our original unit prototype API actually had
conversions for both directions but I think the float->unit version was
removed (or really moved) when the ticker/formatter portion of the unit API
was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I
think that's a plus. The biggest issue we're going to run into is what's
defined as "internal" vs part of the unit API. Some things are easy like
the Axes/Axis API. But we also use low-level APIs like the patches. Are
those unitized? This is the pro and con of using something like Python
where basically everything is public. It makes it possible to do lots of
things, but it's much harder to define a clear library with a specific
public API.

Somewhere in the process we should write a proposal that outlines which
classes/methods are part of the unit api and which are going to be
considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal
implementation classes are placed in a sub-package inside MPL 3.0, it
becomes clearer to people later on what the "official" public API is vs what
can be optimized to just use floats. Obviously the devs would need to
decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the
user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the
moment. Doing this would mean we can convert units as soon as they enter
Matplotlib, only ever have to deal with floating point numbers internally,
and then use the second function as late as possible when the user requests
stuff like e.g. the axis limits.
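
A minimal sketch of this two-function idea, using datetimes as the "unit objects". `ToyAxis`, `set_limits`, and `get_limits` are hypothetical names made up for illustration; this is not how Matplotlib's Axis is implemented.

```python
from datetime import datetime, timezone


def to_float(dt):
    """Mapping 1: unit object -> float (seconds since the Unix epoch)."""
    return dt.timestamp()


def from_float(x):
    """Mapping 2: float -> unit object (timezone-aware UTC datetime)."""
    return datetime.fromtimestamp(x, tz=timezone.utc)


class ToyAxis:
    """Hypothetical axis that stores only floats internally."""

    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._limits = (0.0, 1.0)

    def set_limits(self, lo, hi):
        # Convert as soon as the data enters.
        self._limits = (self._to_float(lo), self._to_float(hi))

    def get_limits(self):
        # Convert back as late as possible, only when the user asks.
        lo, hi = self._limits
        return self._from_float(lo), self._from_float(hi)


axis = ToyAxis(to_float, from_float)
start = datetime(2018, 2, 7, tzinfo=timezone.utc)
end = datetime(2018, 2, 8, tzinfo=timezone.utc)
axis.set_limits(start, end)
```

Everything between `set_limits` and `get_limits` sees only floats, which is the property the proposal is after.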

Also worth noting that any major change like this will go into Matplotlib
3.0 at the earliest, so will be Python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth,
and OK from downstream users. I actually don't think it'll be a huge
change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit
perplexed by their inconsistent and (to my simple mind) convoluted
application in the codebase. Having experience from people who try to use
them every day will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

We use units for everything in our system (in fact, we funded John Hunter
originally to add in a unit system so we could use MPL) so it's a crucial
system for us. In our system, we have our own time classes (which handle
relativistic time frames as well as much higher precision representations)
and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I
understand the words you're using, but I'm not really clear on what the
real proposed changes are. For example, the current unit API returns a
units.AxisInfo object so the converter can set the formatter and locators
to use. Is that what you mean in the 2nd paragraph about ticks and
labels? Or is that changing?

The current unit API is pretty simple and lives in units.ConversionInterface.
Are any of these changes going to change the conversion API? (note - I'm
not against changing it - I'm just not sure if there are any changes or
not).

Another thing to consider: many of the examples people use are scripts
which make a plot and stop. But there are other use cases which are more
complicated and stress the system in different ways. We write several GUI
applications (in PyQt) that use MPL for plotting. In these cases, the user
is interacting with the plot to add and remove artists, change styles,
modify data, etc etc. So having a good object oriented API for modifying
things after construction is important for this to work. So when units are
involved, it can't be a "convert once at construction" and never touch
units again. We are constantly adjusting limits, moving artists, etc in
unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other
items that would be useful to explicitly spell out. Things like which
APIs in MPL should accept units and which won't, and which methods return
unitized data and which don't. It would be nice if there was a clear
policy on this. Maybe one exists and I'm not aware of it - it would be
helpful to repeat it in a discussion on changing the unit system.
Obviously I would love to have every method accept and return unitized data
:-).

I bring this up because I was just working on a hover/annotation class
that needed to move a single annotation artist with the mouse. To move the
annotation box the way I needed to, I had to set one private member
variable, call two set methods, use attribute assignment for one value, and
set one semi-public member variable - some of which worked with units and
some of which didn't. I think having a clear "this kind of method
accepts/returns units" policy would help when people are adding new
accessors/methods/variables to make it more clear what kind of data is
acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit
upgrades, but to make that happen I need to get a clear statement of what
problem is being solved and the scope of the work so I can explain to our
management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead
to more stringent documentation and implementation.

In response to @anntzer I thought about the units support a bit - it seems
that rather than a transform, a more straightforward approach is to have
the converter map to float arrays in a unique way. This float mapping
would be completely analogous to `date2num` in `dates`, in that it doesn't
change and is perfectly invertible without matplotlib ever knowing about
the unit information, though the axis could store it for the tick
locators and formatters. It would also have an inverse that would supply
data back to the user as unit-aware data (though not necessarily in the
unit that the user supplied; e.g. if they supply 8*in, and the
converter converts everything to meter floats, then the returned unitized
inverse would be 0.203*m, or whatever convention the converter wants to
supply).
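
The 8*in example can be sketched as follows. `Length`, `length2num`, and `num2length` are hypothetical names chosen by analogy with `date2num`; they do not exist in Matplotlib, and the conversion table is an assumption made for this illustration.

```python
from dataclasses import dataclass

# Meters per unit; hypothetical table for this sketch only.
_TO_METERS = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """Hypothetical unit-aware value."""
    value: float
    unit: str


def length2num(x):
    """Map a Length to a float in meters, independent of the input unit."""
    return x.value * _TO_METERS[x.unit]


def num2length(num):
    """Inverse mapping; always answers in meters, the converter's convention."""
    return Length(num, "m")


stored = length2num(Length(8, "in"))   # float that the plotting code would see
returned = num2length(stored)          # unit-aware value handed back to the user
```

Here `returned` is a length of roughly 0.203 m rather than 8 in: the mapping is invertible, but the inverse answers in the converter's own convention, not in the unit the user supplied.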

User "unit" control, i.e. making the plot in inches instead of m, would be
accomplished with tick locators and formatters. Matplotlib would never
directly convert between cm and inches (any more than it converts from days
to hours for dates); the downstream-supplied tick formatter and labeller
would do it.
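
One way to picture this division of labour: the stored floats stay in meters, and only the tick placement and labels know about inches. `inch_locator` and `inch_formatter` below are hypothetical plain functions written for this sketch, not Matplotlib `Locator` or `Formatter` subclasses.

```python
import math

METERS_PER_INCH = 0.0254


def inch_locator(lo_m, hi_m):
    """Return tick positions, as meter floats, at whole inches in [lo_m, hi_m]."""
    first = math.ceil(lo_m / METERS_PER_INCH)
    last = math.floor(hi_m / METERS_PER_INCH)
    return [n * METERS_PER_INCH for n in range(first, last + 1)]


def inch_formatter(pos_m):
    """Label a meter-float position in inches; the stored data never changes."""
    return f"{pos_m / METERS_PER_INCH:.0f} in"


ticks = inch_locator(0.0, 0.1)                # view limits given in meters
labels = [inch_formatter(t) for t in ticks]   # labels shown in inches
```

Switching the display to centimeters would mean swapping these two functions only; the data and the view limits would stay as they are.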

Each axis would only get one converter, set by the first call to the axis.
Subsequent calls to the axis would pass all data (including bare floats) to
the converter. If the converter wants to pass bare floats then it can do
so. If it wants to accept other data types then it can do so. It should
be possible for the user to clear or set the converter, but then they
should know what they are doing and why.
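
A small sketch of the "one converter per axis, fixed by the first call" rule. All names here (`ToyAxis`, `REGISTRY`, `MeterConverter`, `FloatConverter`, `clear_converter`) are hypothetical and chosen for illustration; Matplotlib's real registry and Axis behave differently in detail.

```python
class Meters:
    """Hypothetical unitized value."""

    def __init__(self, value):
        self.value = value


class FloatConverter:
    """Default converter: accepts only things float() understands."""

    def convert(self, x):
        return float(x)


class MeterConverter:
    """Hypothetical converter that also chooses to pass bare floats through."""

    def convert(self, x):
        return x.value if isinstance(x, Meters) else float(x)


# Hypothetical registry mapping data types to converters.
REGISTRY = {Meters: MeterConverter()}


class ToyAxis:
    def __init__(self):
        self.converter = None
        self.data = []

    def add(self, x):
        if self.converter is None:
            # The first call to the axis fixes the converter.
            self.converter = REGISTRY.get(type(x), FloatConverter())
        # Every later call, bare floats included, goes through it.
        self.data.append(self.converter.convert(x))

    def clear_converter(self):
        """Explicit override for users who know why they need it."""
        self.converter = None


axis = ToyAxis()
axis.add(Meters(2.0))  # first call selects MeterConverter
axis.add(3.0)          # bare float is passed to the same converter
```

In this sketch an axis whose first call received bare floats would reject `Meters` later with a `TypeError`, because the float converter decides what it accepts; calling `clear_converter()` first is the deliberate escape hatch.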

What's missing? I don't think this is wildly different than what we have,
but maybe a bit more clear.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/matplotlib-devel/attachments/20180208/027540af/attachment-0001.html>

If you think you can make this work, I'm all for it. This would definitely
be a project where I think a large PR covering many changes would be nicer
(well, it'd still be hell to review...) to convince us skeptics :-) that
the approach is indeed viable.
Antony

2018-02-08 19:02 GMT+01:00 David Stansby <dstansby at gmail.com>:

*puts hand up* I'm (sort of...) a Matplotlib developer and use (am
starting to use) units in my day to day research.

My proposal remains this:

   - The user provides a method for converting objects of their custom
   class to floating point numbers
   - The user provides a method for converting floating point numbers to
   their custom class
   - *Everything* in Matplotlib that accepts data accepts custom type
   objects
   - The first thing it does is convert these objects to floats, and then
   everything we do internally is with those floats
Can anyone point out reasons that this isn't the right way to do it? This
has the advantages:

   - Matplotlib takes no responsibility for the conversion
   - We only ever calculate with floats
   - You can use whatever objects you want, as long as your converter
   goes object --> float

Having thought a little bit this seems like the obvious way to do it, but
I may be missing something.

David

I realize that units are "a pain", but they're hugely useful. Just
plotting datetimes is going to be a pain without units (and was a huge pain
before the unit system). The proposal that only Axes supports units is
going to cause us a massive problem as that's rarely everything that we do
with a plot. I could do a survey to find all the interactions we use (and
that doesn't even touch the 1000's of lines of code our users have written)
if that would help but anything that's part of the public api (axes,
artists, patches, etc) is probably being used - i.e. pretty much anything
that's in the current user's guide is something that we use/want/need to
work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features
(maybe the existing one is fine, but please see matplotlib/matplotlib
issue #9713, "Units handling different with plot than other functions...",
for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are
implemented
        a) in matplotlib modules
        b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the
draw stage (or cache stage) for all artists. That's maybe fine, but as a
new developer, I found the units support woefully under-documented. The
fact that others have hacked in units support in various inconsistent ways
means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that
we don't need unit support. But I don't think we are being hypercritical
in pointing out it needs work.

Thanks a lot, Jody



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/matplotlib-devel/attachments/20180208/1e55cb2a/attachment-0001.html>


It sounds like the solution to not supporting units everywhere was to tell users to add a one line call to all inputs and a one line call to all outputs to convert to/from floats. So if that's a viable solution, then doesn't it follow that having that code in the library is also a viable solution? And isn't that the point of a library to reduce the same code being written over and over again? I'm not saying it's not work. But I don't see any concrete problems with the approach being put forth.
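
To make the "put the one-line conversions in the library" argument concrete, here is a hedged sketch in which a decorator does the input and output conversion once, so callers do not repeat it. `Meters`, `strip`, `unitized`, and `midpoint` are hypothetical names invented for this example, not Matplotlib APIs.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Meters:
    """Hypothetical unitized value."""
    value: float


def strip(x):
    """Input side: unit object -> float; bare values pass through."""
    return x.value if isinstance(x, Meters) else x


def unitized(func):
    """Hypothetical decorator: floats inside, units at the boundary."""
    @functools.wraps(func)
    def wrapper(*args):
        had_units = any(isinstance(a, Meters) for a in args)
        result = func(*(strip(a) for a in args))
        # Output side: hand units back only if the caller supplied them.
        return Meters(result) if had_units else result
    return wrapper


@unitized
def midpoint(lo, hi):
    # The body only ever sees plain floats.
    return (lo + hi) / 2
```

With this in place `midpoint(1.0, 3.0)` gives a bare float while `midpoint(Meters(1.0), Meters(3.0))` gives a `Meters` value, and existing float-only callers are unaffected.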

One possible plan might be:

1) Decide on the data management approach. Starting with something like "units are removed and added at the function interface points and not kept internally" might be fine. So basically classes have to do the input->internal conversion for inputs and the internal->output conversion for return values but unitized data is not kept internally. This does require that unit converters also handle non-unitized data as MPL classes will be calling other methods with their internal data (non-unitized) but that shouldn't be a huge problem. This might also lead to a class hierarchy of unit converters where a user converter tries to handle the data and if it can't, it calls the base converter which handles floats, numpy, arrays, etc.

2) Update the doc processing (or just codify a standard) to identify unit supporting methods. Add developer docs to explain how and where external<->internal data conversions should occur.

3) Implement "standard" date and unit test classes and converters to use for all test cases

4) Create a prioritized list of the APIs to work on. Start with Axes and add standardized unit handling in all of those methods first (or maybe prioritize the methods in Axes since there are a lot of them). Ideally, this includes updating the docs and having unit and non-unit test cases for each. Once that's done and proves the concept, start working through the underlying Artists and Patches as time permits.

I think this approach allows individual methods to be updated and tested. They don't all have to be done at once and since the converter has to handle non-unitized data, this shouldn't break existing code. It seems like each method in Axes can be its own PR w/ test case to make review and merging simpler. Working from the top (Axes) down also means that the internal classes don't need to be changed.
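
The converter hierarchy mentioned in step 1 could look roughly like the sketch below, where a user converter handles its own type and defers everything else to a base converter. `BaseConverter`, `SecondsConverter`, and `Seconds` are hypothetical names used for illustration only.

```python
class BaseConverter:
    """Handles data that is already unit-free (numbers, lists of numbers)."""

    def to_float(self, x):
        if isinstance(x, (list, tuple)):
            # Dispatch per element so subclasses still see each item.
            return [self.to_float(v) for v in x]
        return float(x)


class Seconds:
    """Hypothetical unitized value."""

    def __init__(self, value):
        self.value = value


class SecondsConverter(BaseConverter):
    """Hypothetical user converter; falls back to the base class otherwise."""

    def to_float(self, x):
        if isinstance(x, Seconds):
            return float(x.value)
        return super().to_float(x)


conv = SecondsConverter()
mixed = conv.to_float([Seconds(1), 2, 3.5])
```

Because non-unitized data falls through to the base class, internal code that already holds floats can call the same converter without special-casing, which is what lets methods be migrated one at a time.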

I can probably help w/ resources for doing this but I'll have to check on availability once a plan is finalized.

Ted


________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Antony Lee <antony.lee@berkeley.edu>
Sent: Thursday, February 8, 2018 10:16 AM
To: David Stansby
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

If you think you can make this work, I'm all for it. This would definitely be a project where I think a large PR covering many changes would be nicer (well, it'd still be hell to review...) to convince us skeptics :-) that the approach is indeed viable.
Antony

2018-02-08 19:02 GMT+01:00 David Stansby <dstansby at gmail.com>:
*puts hand up* I'm (sort of...) a Matplotlib developer and use (am starting to use) units in my day to day research.

My proposal remains this:

  * The user provides a method for converting objects of their custom class to floating point numbers
  * The user provides a method for converting floating point numbers to their custom class
  * Everything in Matplotlib that accepts data accepts custom type objects
  * The first thing it does is convert these objects to floats, and then everything we do internally is with those floats

Can anyone point out reasons that this isn't the right way to do it? This has the advantages:

  * Matplotlib takes no responsibility for the conversion
  * We only ever calculate with floats
  * You can use whatever objects you want, as long as your converter goes object --> float
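
To make the proposal concrete, here is a small sketch that uses `datetime.timedelta` as the "custom class". The names `to_float`, `from_float`, and `data_limits` are hypothetical, and `data_limits` merely stands in for library code; nothing here is existing Matplotlib API.

```python
from datetime import timedelta


# The two user-supplied functions from the proposal.
def to_float(obj):
    return obj.total_seconds()


def from_float(x):
    return timedelta(seconds=x)


def data_limits(values, to_float, from_float):
    """Stand-in for library code: convert on entry, compute with floats only."""
    floats = [to_float(v) for v in values]      # first step: objects -> floats
    lo, hi = min(floats), max(floats)
    pad = 0.05 * (hi - lo)                      # pure float arithmetic
    # Convert back only at the point where the user asks for a result.
    return from_float(lo - pad), from_float(hi + pad)


lo, hi = data_limits([timedelta(minutes=1), timedelta(minutes=3)],
                     to_float, from_float)
print(lo, hi)
```

The library-side function never inspects the objects it is given; all knowledge of the custom type lives in the two functions the user passes in.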

Having thought a little bit this seems like the obvious way to do it, but I may be missing something.

David

On 8 February 2018 at 17:39, Jody Klymak <jklymak at uvic.ca> wrote:

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see "Units handling different with plot than other functions..." (Issue #9713, matplotlib/matplotlib on GitHub) for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
    a) in matplotlib modules
    b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don't need unit support. But I don't think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is to create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary APIs for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without someone considering units, which then triggers bug reports that require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com <anntzer.lee at gmail.com> on behalf of Antony Lee <antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov>:
That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I think that's a plus. The biggest issue we're going to run into is what's defined as "internal" vs part of the unit API. Some things are easy like the Axes/Axis API. But we also use low level APIs like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit API and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs what can be optimized to just use floats. Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users. I actually don't think it'll be a huge change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them every day will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to have explicitly spelled out. Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation.

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied, e.g. if they supply 8*in, and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
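
As a rough illustration of display units living entirely in the formatter, assume the converter has already stored everything as meter floats. `make_formatter` is a hypothetical helper, not part of Matplotlib; the `(x, pos)` signature is chosen only so that the returned function could be wrapped in `matplotlib.ticker.FuncFormatter`.

```python
M_PER_INCH = 0.0254


def make_formatter(unit_name, meters_per_unit):
    """Build a tick-label function for data stored as meter floats."""
    def fmt(x, pos=None):
        # Only the label changes; the stored float x is never modified.
        return f"{x / meters_per_unit:g} {unit_name}"
    return fmt


stored = [0.0, 0.0254, 0.0508]          # plotted data, already meter floats

in_inches = make_formatter("in", M_PER_INCH)
in_meters = make_formatter("m", 1.0)

print([in_inches(x) for x in stored])
print([in_meters(x) for x in stored])
```

Switching the plot from meters to inches then amounts to swapping the formatter (and a matching locator), with no conversion of the plotted data.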

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
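
A minimal sketch of the "one converter per axis" rule, under the assumption that the converter is chosen from the first data the axis sees. `AxisSketch`, `meters`, and `pick_converter` are hypothetical names for illustration, not Matplotlib classes or functions.

```python
class AxisSketch:
    """Hypothetical axis that locks in a converter on first use."""

    def __init__(self):
        self.converter = None

    def update(self, data, chooser):
        # The first call chooses the converter; later calls reuse it,
        # so even bare floats are routed through the same converter.
        if self.converter is None:
            self.converter = chooser(data)
        return [self.converter(v) for v in data]

    def clear_converter(self):
        # Explicit reset for users who know what they are doing and why.
        self.converter = None


def meters(value):
    """Converter for (magnitude, unit) pairs that also passes bare numbers."""
    if isinstance(value, tuple):
        magnitude, unit = value
        return magnitude * {"m": 1.0, "cm": 0.01}[unit]
    return float(value)


def pick_converter(data):
    """Hypothetical lookup keyed on the type of the first datum."""
    return meters if isinstance(data[0], tuple) else float


axis = AxisSketch()
first = axis.update([(5, "cm"), (1, "m")], pick_converter)
second = axis.update([2.0], pick_converter)   # bare float, same converter
print(first, second, axis.converter.__name__)
```

Whether bare floats are accepted is decided by the converter (here `meters` lets them through), not by the axis.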

What's missing? I don't think this is wildly different than what we have, but maybe a bit more clear.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel

I think we can help with building a better toy unit system. Or we can standardize on datetime and some existing unit package. Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn't support multiple units. It's really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work.

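# myunitclass is a placeholder for whichever unit package is chosen;
# `inch` is used because `in` is a reserved word in Python
x = np.arange(10)
y = x*2 * myunitclass.inch
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)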

So when a new feature is added, we can ask that its units support is made clear. I guess I don't mind if those are astropy units or yt units, or pint, or ...?, though there will be some pushback about including another test dependency.

Would pint units work? It's a very small dependency, but maybe not as full featured, or structured wildly differently from the others?

A test suite to my mind would
- test basic functionality
- test mixing allowed dimensions (i.e. inches and centimeters)
- test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
- test that disallowed mixed dimensions fail.
- ??
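
The first few items on this list could be sketched as test cases along the following lines. This is only an illustration with a toy stand-in: `UNITS`, `ToyAxis`, and its methods are hypothetical and stdlib-only, not Matplotlib or any real unit package, and base-unit floats are assumed as the internal storage.

```python
import unittest

# Hypothetical toy unit table: name -> (dimension, factor to the base unit).
UNITS = {"in": ("length", 0.0254), "cm": ("length", 0.01), "s": ("time", 1.0)}


class ToyAxis:
    """Stores base-unit floats and remembers the dimension of the first data."""

    def __init__(self):
        self.dimension = None
        self.data = []

    def plot(self, magnitudes, unit):
        dimension, factor = UNITS[unit]
        if self.dimension is None:
            self.dimension = dimension
        elif dimension != self.dimension:
            raise ValueError(f"cannot mix {self.dimension} and {dimension}")
        self.data.extend(m * factor for m in magnitudes)

    def tick_labels(self, unit):
        # Changing the display unit only changes labels, not stored floats.
        _, factor = UNITS[unit]
        return [f"{x / factor:g}" for x in self.data]


class TestToyAxis(unittest.TestCase):
    def test_mixing_compatible_units(self):
        ax = ToyAxis()
        ax.plot([1], "in")
        ax.plot([2.54], "cm")
        self.assertAlmostEqual(ax.data[0], ax.data[1])

    def test_changing_display_unit(self):
        ax = ToyAxis()
        ax.plot([1], "in")
        self.assertEqual(ax.tick_labels("cm"), ["2.54"])

    def test_mixing_dimensions_fails(self):
        ax = ToyAxis()
        ax.plot([1], "in")
        with self.assertRaises(ValueError):
            ax.plot([1], "s")
```

Saved to a file, these cases can be run with `python -m unittest`; a real suite would exercise actual Axes methods with whichever unit package is agreed on.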

Cheers, Jody


On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

________________________________________
From: Jody Klymak <jklymak at uvic.ca <mailto:jklymak at uvic.ca>>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop better toy unit module that has most of the desired features (maybe the existing one is fine, but please see Units handling different with plot than other functions... · Issue #9713 · matplotlib/matplotlib · GitHub for why I?m a little dismayed with the state of things).

2) write a developer?s guide explaining how units should be/are implemented
a) in matplotlib modules
       b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. Thats maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don?t need unit support. But I don?t think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary API's for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without something considering units which then triggers a bug reports which require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com><mailto:anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com>> <anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com><mailto:anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com>>> on behalf of Antony Lee <antony.lee at berkeley.edu <mailto:antony.lee at berkeley.edu><mailto:antony.lee at berkeley.edu <mailto:antony.lee at berkeley.edu>>>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but does not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that if the deunitized and the unitized data are usually linked by an affine transform, so the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>>>:
That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to easier and faster so I think that's a plus. The biggest issue we're going to run in to is what's defined as "internal" vs part of the unit API. Some things are easy like the Axes/Axis API. But we also use low level API's like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit api and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official' public API vs what can be optimized to just use floats. Obviously the dev's would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com <mailto:dstansby at gmail.com><mailto:dstansby at gmail.com <mailto:dstansby at gmail.com>><mailto:dstansby at gmail.com <mailto:dstansby at gmail.com>>>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca <mailto:jklymak at uvic.ca><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>>>> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don?t worry, nothing at all is changing w/o substantial back and forth, and OK from downstream users. I actually don?t think it?ll be a huge change, probably just some clean up and better documentation.

FWIW, I?ve not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them everyday will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov><mailto:theodore.r.drain at jpl.nasa.gov <mailto:theodore.r.drain at jpl.nasa.gov>>>> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spelled out. Things like which API's in MPL should accept units and which won't and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set to one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which work with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org <mailto:matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org><mailto:matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org <mailto:matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org>><mailto:jpl.nasa.gov at python.org <mailto:jpl.nasa.gov at python.org>><mailto:jpl.nasa.gov at python.org <mailto:jpl.nasa.gov at python.org><mailto:jpl.nasa.gov at python.org <mailto:jpl.nasa.gov at python.org>>>> on behalf of Jody Klymak <jklymak at uvic.ca <mailto:jklymak at uvic.ca><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca><mailto:jklymak at uvic.ca <mailto:jklymak at uvic.ca>>>>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully leading to more stringent documentation and implementation...

In response to @anntzer, I thought about the units support a bit. It seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user as unit-aware data, though not necessarily in the unit that the user supplied: e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply.
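
A minimal sketch of such an invertible mapping, assuming a converter that stores every length as float meters. The names `Length`, `to_float` and `from_float` are hypothetical and are not part of matplotlib:

```python
from dataclasses import dataclass

# Hypothetical unit table: how many meters one of each unit is worth.
METERS_PER_UNIT = {"m": 1.0, "cm": 0.01, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """Toy unitized value, e.g. Length(8, "in")."""
    value: float
    unit: str


def to_float(length: Length) -> float:
    """Map a unitized value to the float that would be stored (meters)."""
    return length.value * METERS_PER_UNIT[length.unit]


def from_float(x: float) -> Length:
    """Inverse mapping; always answers in meters, whatever the user supplied."""
    return Length(x, "m")


stored = to_float(Length(8, "in"))
print(round(stored, 4))    # 0.2032
print(from_float(stored))  # Length(value=0.2032, unit='m')
```

The round trip returns meters rather than inches, which is the "not necessarily in the unit that the user supplied" behaviour described above.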

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
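
As a rough illustration of leaving the stored floats alone and changing only the labels, here is a sketch of a formatter function, assuming the stored floats are meters. The function name and the constant are made up for the example:

```python
# Assumption: the axis stores meters; only the tick labels show inches.
METERS_PER_INCH = 0.0254


def inch_tick_label(x_meters: float, pos: int = 0) -> str:
    """Format a stored meter float as an inch label."""
    return f"{x_meters / METERS_PER_INCH:.1f} in"


labels = [inch_tick_label(x) for x in (0.0, 0.0254, 0.2032)]
print(labels)  # ['0.0 in', '1.0 in', '8.0 in']
```

The `(x, pos)` signature matches what `matplotlib.ticker.FuncFormatter` expects, so a function like this could be installed with `ax.xaxis.set_major_formatter`.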

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.
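
The "one converter per axis, fixed by the first call" rule could look roughly like this. `ToyAxis`, `update` and `meters_converter` are hypothetical names used only for this sketch:

```python
def meters_converter(value):
    """Toy converter: (number, unit) tuples or bare floats (taken as meters)."""
    if isinstance(value, tuple):
        number, unit = value
        return number * {"m": 1.0, "cm": 0.01, "in": 0.0254}[unit]
    return float(value)


class ToyAxis:
    def __init__(self):
        self.converter = None  # unset until the first data arrives

    def set_converter(self, converter):
        """Explicit override, for users who know what they are doing."""
        self.converter = converter

    def update(self, values, default_converter):
        if self.converter is None:
            self.converter = default_converter  # fixed by the first call
        # Every later call, bare floats included, goes through the same converter.
        return [self.converter(v) for v in values]


axis = ToyAxis()
first = axis.update([(8, "in")], meters_converter)
# The second default is ignored because the converter is already set.
second = axis.update([1.5, (50, "cm")], float)
print(first, second)
```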

What's missing? I don't think this is wildly different from what we have, but maybe a bit clearer.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
Matplotlib-devel Info Page

--
Jody Klymak


I think we can help with building a better toy unit system. Or we can
standardize on datetime and some existing unit package. Whatever makes it
easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units
handling, in that it doesn't support multiple units. It's really just a
class of data that we have a known conversion to float for.

What we need an example of is how the following should work.

x = np.arange(10)
y = x*2 * myunitclass.inch  # "in" is a Python keyword, hence "inch"
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)

So when a new feature is added, we can ask that its units support is made
clear. I guess I don't mind if those are astropy units or yt units, or
pint, or ... though there will be some pushback about including another
test dependency.

Would pint units work? It's a very small dependency, but maybe not as full
featured, or structured wildly differently from the others?

One wrinkle: pint implements a "wrapper" array class rather than an ndarray
subclass. Both astropy and yt use an ndarray subclass. There are some
classes of errors that only happen for one style of unit arrays and other
classes of errors that only happen for the other.

A test suite to my mind would
- test basic functionality
- test mixing compatible units (e.g. inches and centimeters)
- test changing the axis units (so all the plotted data changes its
values, *or* the tick locators/formatters change their values).
- test that mixing incompatible dimensions fails.
- ??
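
A sketch of what three of those tests might look like against a toy axis, using only the standard library. `ToyAxis` and the `UNITS` table are hypothetical stand-ins, not matplotlib or any real unit package. The "changing the axis units" item is left out because the thread has not settled which behaviour is wanted:

```python
import unittest

# Hypothetical toy units: name -> (dimension, factor to the base unit).
UNITS = {"in": ("length", 0.0254), "cm": ("length", 0.01), "s": ("time", 1.0)}


class ToyAxis:
    """Stand-in for an axis that stores plain floats in base units."""

    def __init__(self):
        self.dimension = None
        self.floats = []

    def plot(self, values, unit):
        dimension, factor = UNITS[unit]
        if self.dimension is None:
            self.dimension = dimension  # the first call fixes the dimension
        elif dimension != self.dimension:
            raise ValueError(f"cannot mix {dimension} with {self.dimension}")
        self.floats.extend(v * factor for v in values)


class UnitAxisTests(unittest.TestCase):
    def test_basic(self):
        axis = ToyAxis()
        axis.plot([1, 2], "in")
        self.assertAlmostEqual(axis.floats[1], 0.0508)

    def test_mixing_compatible_units(self):
        axis = ToyAxis()
        axis.plot([1], "in")
        axis.plot([2.54], "cm")
        self.assertAlmostEqual(axis.floats[0], axis.floats[1])

    def test_incompatible_dimensions_fail(self):
        axis = ToyAxis()
        axis.plot([1], "in")
        with self.assertRaises(ValueError):
            axis.plot([1], "s")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UnitAxisTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```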

Cheers, Jody

________________________________________
From: Jody Klymak <jklymak at uvic.ca>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful. Just
plotting datetimes is going to be a pain without units (and was a huge pain
before the unit system). The proposal that only Axes supports units is
going to cause us a massive problem as that's rarely everything that we do
with a plot. I could do a survey to find all the interactions we use (and
that doesn't even touch the 1000's of lines of code our users have written)
if that would help but anything that's part of the public api (axes,
artists, patches, etc) is probably being used - i.e. pretty much anything
that's in the current user's guide is something that we use/want/need to
work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features
(maybe the existing one is fine, but please see matplotlib issue #9713,
"Units handling different with plot than other functions...", for why
I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
       a) in matplotlib modules
       b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the
draw stage (or cache stage) for all artists. That's maybe fine, but as a
new developer, I found the units support woefully under-documented. The
fact that others have hacked in units support in various inconsistent ways
means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that
we don't need unit support. But I don't think we are being hypercritical
in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying
"just Axes has units" is basically saying the only valid unit use case is
create a plot one time and look at it. You can't manipulate it, edit it,
or build any kind of plotting GUI application (which we have many of) once
the plot has been created. The Artist classes are one of the primary APIs
for applications. Artists are created, edited, and manipulated if you want
to allow the user to modify things in a plot after it's created. Even
the most basic cases like calling Line2D.set_data() wouldn't be allowed
with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target.
The reason it keeps popping up is that code gets added without anyone
considering units, which then triggers bug reports that require fixing.
If there was a clearer policy and new code was required to have test cases
that cover non-unit and unit inputs, I think things would go much
smoother. We'd be happy to help with submitting new test cases to cover
unit cases in existing code once a policy is decided on. Maybe what's
needed is better documentation for developers who don't use units so they
can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com on behalf of Antony Lee
<antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life
piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some
devs (including myself) being relatively dismissive about unit support is
the lack of a well-defined use case, other than "it'd be nice if we
supported units" (i.e., especially from the point of view of devs who
*don't* use units themselves, it ends up being an ever moving target). In
particular, tests on unit support ("unit unit tests"? :-)) currently only
rely on the old JPL unit code that ended up integrated into Matplotlib's
test suite, but do not test integration with the two major unit packages I
am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to
represent all kinds of relevant units. In particular, I was at some point
hoping to completely work in deunitized data internally, *including the
plotting*, and rely on the fact that the deunitized and the unitized
data are usually linked by an affine transform, so that the plotting part
doesn't need to convert back to unitized data and we only need to place and
label the ticks accordingly; however Ted mentioned relativistic units,
which imply the use of a non-affine transform. So I think it would also be
really helpful if JPL could release some reasonably documented unit library
with their actual use cases (and how it differs from pint & astropy.units),
so that we know better what is actually needed (I believe carrying the JPL
unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API
discussion, I believe a relatively simple and consistent line would be to
make Axes methods unitized and everything else deunitized (but with clear
ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P)
<theodore.r.drain at jpl.nasa.gov>:
That sounds fine to me. Our original unit prototype API actually had
conversions for both directions but I think the float->unit version was
removed (or really moved) when the ticker/formatter portion of the unit API
was settled on.

Using floats/numpy arrays internally is going to be easier and faster so I
think that's a plus. The biggest issue we're going to run into is what's
defined as "internal" vs part of the unit API. Some things are easy like
the Axes/Axis API. But we also use low level APIs like the patches. Are
those unitized? This is the pro and con of using something like Python
where basically everything is public. It makes it possible to do lots of
things, but it's much harder to define a clear library with a specific
public API.

Somewhere in the process we should write a proposal that outlines which
classes/methods are part of the unit api and which are going to be
considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal
implementation classes are placed in a sub-package inside MPL 3.0, it
becomes clearer to people later on what the "official" public API is vs
what can be optimized to just use floats. Obviously the devs would need to
decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the
user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the
moment. Doing this would mean we can convert units as soon as they enter
Matplotlib, only ever have to deal with floating point numbers internally,
and then use the second function as late as possible when the user requests
stuff like e.g. the axis limits.
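
A small sketch of that two-function arrangement, assuming dates as the "unit objects" so that only the standard library is needed. `ToyAxis`, `set_lim` and `get_lim` are hypothetical names, not the matplotlib API:

```python
from datetime import date


def date_to_float(d: date) -> float:
    """Function 1: unit object -> float (here: proleptic ordinal day)."""
    return float(d.toordinal())


def float_to_date(x: float) -> date:
    """Function 2: float -> unit object."""
    return date.fromordinal(int(round(x)))


class ToyAxis:
    def __init__(self, to_float, from_float):
        self._to_float = to_float
        self._from_float = from_float
        self._limits = (0.0, 1.0)  # always plain floats internally

    def set_lim(self, low, high):
        # Convert as soon as the data enters.
        self._limits = (self._to_float(low), self._to_float(high))

    def get_lim(self):
        # Convert back as late as possible, only when the user asks.
        return tuple(self._from_float(x) for x in self._limits)


axis = ToyAxis(date_to_float, float_to_date)
axis.set_lim(date(2018, 2, 3), date(2018, 2, 8))
print(axis.get_lim())  # (datetime.date(2018, 2, 3), datetime.date(2018, 2, 8))
```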

Also worth noting that any major change like this will go into Matplotlib
3.0 at the earliest, so will be Python 3 only.

David

Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth,
and OK from downstream users. I actually don't think it'll be a huge
change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit
perplexed by their inconsistent and (to my simple mind) convoluted
application in the codebase. Having experience from people who try to use
them everyday will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P)
<theodore.r.drain at jpl.nasa.gov> wrote:

We use units for everything in our system (in fact, we funded John Hunter
originally to add in a unit system so we could use MPL) so it's a crucial
system for us. In our system, we have our own time classes (which handle
relativistic time frames as well as much higher precision representations)
and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I
understand the words you're using, but I'm not really clear on what the
real proposed changes are. For example, the current unit API returns a
units.AxisInfo object so the converter can set the formatter and locators
to use. Is that what you mean in the 2nd paragraph about ticks and
labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface.
Are any of these changes going to change the conversion API? (note - I'm
not against changing it - I'm just not sure if there are any changes or
not).

Another thing to consider: many of the examples people use are scripts
which make a plot and stop. But there are other use cases which are more
complicated and stress the system in different ways. We write several GUI
applications (in PyQt) that use MPL for plotting. In these cases, the user
is interacting with the plot to add and remove artists, change styles,
modify data, etc etc. So having a good object oriented API for modifying
things after construction is important for this to work. So when units are
involved, it can't be a "convert once at construction" and never touch
units again. We are constantly adjusting limits, moving artists, etc in
unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other
items that would be useful to spell out explicitly: which APIs in MPL
should accept units and which won't, and which methods return unitized
data and which don't. It would be nice if there was a clear policy on
this. Maybe one exists and I'm not aware of it, in which case it would be
helpful to repeat it in a discussion on changing the unit system.
Obviously I would love to have every method accept and return unitized
data :-).

I bring this up because I was just working on a hover/annotation class
that needed to move a single annotation artist with the mouse. To move the
annotation box the way I needed to, I had to set one private member
variable, call two set methods, use attribute assignment for one value, and
set one semi-public member variable - some of which worked with units and
some of which didn't. I think having a clear "this kind of method
accepts/returns units" policy would help when people are adding new
accessors/methods/variables to make it clearer what kind of data is
acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit
upgrades, but to make that happen I need to get a clear statement of what
problem is being solved and the scope of the work so I can explain to our
management why it's important.



--
Jody Klymak
Jody M. Klymak - UVic Ocean Physics



On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <jklymak at uvic.ca> wrote:

On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:
On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:

One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. There are some classes of errors that only happen for one style of unit arrays and other classes of errors that only happen for the other.

OK, glad I asked. Is that the case w/ JPL units as well? If that stipulation makes things easier, maybe we could enforce it?

Is there a smaller library that subclasses ndarray for units support? I imagine we could vendorize a subset of whatever astropy or yt do. Or maybe they aren't so huge that they would be unreasonable to make as test dependencies. yt is only 68 Mb.

Cheers, Jody


--
Jody Klymak


Does numpy subclassing really matter? If the docs say the unit converter must convert from the external type to the internal type, then as long as the converter does that, it doesn't matter what the external type is or what it inherits from, right? The point is that the converter class is the only class manipulating the external data objects - MPL shouldn't care what they are or what they inherit from.

I think one issue is that data types are malleable in the API right now. Lists, tuples, numpy, ints, floats, etc are all possible inputs in many/most cases. IMO, the unit API should not be malleable at all. The unit converter API should say that the return type of external->internal conversion is always a specific value type (e.g. list of float, numpy float64 array).
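
To make the "one fixed return type" idea concrete, here is a stdlib-only sketch that accepts several input shapes but always returns a list of floats. The function name is hypothetical, and a real converter might well return a NumPy float64 array instead:

```python
from numbers import Real


def to_float_list(data):
    """Accept a scalar, list, or tuple of numbers; always return list[float]."""
    if isinstance(data, Real):
        data = [data]  # promote a bare scalar to a one-element sequence
    return [float(x) for x in data]


print(to_float_list(3))         # [3.0]
print(to_float_list((1, 2.5)))  # [1.0, 2.5]
print(to_float_list([4, 5]))    # [4.0, 5.0]
```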

Jody: IMO, your example should plot the data in inches in the first plot call, then convert the second input to inches and plot that. The plot call supports the xunits keyword argument, which tells the converter what floating point unit conversion to apply. If that keyword is not specified, then it defaults to the type of the input. The example that needs to be more clear is if I do this:

ax.plot( x1, y1, xunits="km" )
ax.plot( x2, y2, xunits="miles" )

IMO, either the floats are km or miles, not both. So either the first call sticks the converter to using km and the second xunits is ignored, or the second input overrides the first and requires that the first artists go back through a conversion to miles. Either is a reasonable choice for behavior (but the first is much easier to implement).
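
A sketch of the first (easier) policy, where the first `xunits` sticks and later requests are ignored. `ToyXAxis` and its `(number, unit)` tuples are hypothetical stand-ins for real unitized data:

```python
# 1 mile is exactly 1.609344 km.
KM_PER_UNIT = {"km": 1.0, "miles": 1.609344}


class ToyXAxis:
    def __init__(self):
        self.units = None  # unit of the stored floats, fixed on first use
        self.floats = []

    def plot(self, values, xunits=None):
        """values are (number, unit) tuples; xunits only matters the first time."""
        if self.units is None:
            self.units = xunits or values[0][1]  # default: unit of the input
        for number, unit in values:
            self.floats.append(number * KM_PER_UNIT[unit] / KM_PER_UNIT[self.units])


axis = ToyXAxis()
axis.plot([(10, "km")], xunits="km")
axis.plot([(1, "miles")], xunits="miles")  # request ignored; axis stays in km
print(axis.units, axis.floats)  # km [10.0, 1.609344]
```

The second policy would instead have to revisit every stored float (and every existing artist) when the unit changes, which is why it is the harder one to implement.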


________________________________________
From: Nathan Goldbaum <nathan12343@gmail.com>
Sent: Thursday, February 8, 2018 10:52 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <jklymak at uvic.ca> wrote:

On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

I think we can help with building a better toy unit system. Or we can standardize on datetime and some existing unit package. Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling, in that it doesn't support multiple units. It's really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work.

x = np.arange(10)
y = x*2 * myunitclass.inch  # "in" is a Python keyword, hence "inch"
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)

So when a new feature is added, we can ask that its units support is made clear. I guess I don't mind if those are astropy units or yt units, or pint, or ... though there will be some pushback about including another test dependency.

Would pint units work? It's a very small dependency, but maybe not as full featured, or structured wildly differently from the others?

One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. There are some classes of errors that only happen for one style of unit arrays and other classes of errors that only happen for the other.

A test suite to my mind would
- test basic functionality
- test mixing compatible units (e.g. inches and centimeters)
- test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
- test that mixing incompatible dimensions fails.
- ??

Cheers, Jody

________________________________________
From: Jody Klymak <jklymak at uvic.ca>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see matplotlib issue #9713, "Units handling different with plot than other functions...", for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
       a) in matplotlib modules
       b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don't need unit support. But I don't think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

This is kind of what I meant in my previous email about use cases. Saying "just Axes has units" is basically saying the only valid unit use case is create a plot one time and look at it. You can't manipulate it, edit it, or build any kind of plotting GUI application (which we have many of) once the plot has been created. The Artist classes are one of the primary APIs for applications. Artists are created, edited, and manipulated if you want to allow the user to modify things in a plot after it's created. Even the most basic cases like calling Line2D.set_data() wouldn't be allowed with units if only Axes has unit support.

I'm not sure I understand the statement that units are a moving target. The reason it keeps popping up is that code gets added without anyone considering units, which then triggers bug reports that require fixing. If there was a clearer policy and new code was required to have test cases that cover non-unit and unit inputs, I think things would go much smoother. We'd be happy to help with submitting new test cases to cover unit cases in existing code once a policy is decided on. Maybe what's needed is better documentation for developers who don't use units so they can easily write a test case with units when adding/modifying functionality.

Ted

________________________________________
From: anntzer.lee at gmail.com on behalf of Antony Lee <antony.lee at berkeley.edu>
Sent: Thursday, February 8, 2018 8:09 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I'm momentarily a bit away from Matplotlib development due to real life piling up, so I'll just keep this short.

One major point (already mentioned by others) that led, I think, to some devs (including myself) being relatively dismissive about unit support is the lack of a well-defined use case, other than "it'd be nice if we supported units" (i.e., especially from the point of view of devs who *don't* use units themselves, it ends up being an ever moving target). In particular, tests on unit support ("unit unit tests"? :-)) currently only rely on the old JPL unit code that ended up integrated into Matplotlib's test suite, but do not test integration with the two major unit packages I am aware of (pint and astropy.units).

From the email of Ted it appears that these are not sufficient to represent all kinds of relevant units. In particular, I was at some point hoping to completely work in deunitized data internally, *including the plotting*, and rely on the fact that the deunitized and the unitized data are usually linked by an affine transform, so that the plotting part doesn't need to convert back to unitized data and we only need to place and label the ticks accordingly; however Ted mentioned relativistic units, which imply the use of a non-affine transform. So I think it would also be really helpful if JPL could release some reasonably documented unit library with their actual use cases (and how it differs from pint & astropy.units), so that we know better what is actually needed (I believe carrying the JPL unit code in our own code base is a mistake).

As for the public vs private, or rather unitized vs deunitized API discussion, I believe a relatively simple and consistent line would be to make Axes methods unitized and everything else deunitized (but with clear ways to convert to and from unitized data when not using Axes methods).

Antony

2018-02-07 16:33 GMT+01:00 Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov>:
That sounds fine to me. Our original unit prototype API actually had conversions for both directions but I think the float->unit version was removed (or really moved) when the ticker/formatter portion of the unit API was settled on.

Using floats/numpy arrays internally is going to be easier and faster, so I think that's a plus. The biggest issue we're going to run into is what's defined as "internal" vs. part of the unit API. Some things are easy, like the Axes/Axis API. But we also use low-level APIs like the patches. Are those unitized? This is the pro and con of using something like Python where basically everything is public. It makes it possible to do lots of things, but it's much harder to define a clear library with a specific public API.

Somewhere in the process we should write a proposal that outlines which classes/methods are part of the unit API and which are going to be considered internal. I'm sure we can help with that effort.

That also might help clarify/influence code structure - if internal implementation classes are placed in a sub-package inside MPL 3.0, it becomes clearer to people later on what the "official" public API is vs. what can be optimized to just use floats. Obviously the devs would need to decide if that kind of restructuring is worth it or not.

Ted

________________________________________
From: David Stansby <dstansby at gmail.com>
Sent: Wednesday, February 7, 2018 3:42 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

Practically, I think what we are proposing is that for unit support the user must supply two functions for each axis:

* A mapping from your unit objects to floating point numbers
* A mapping from those floats back to your unit objects

As far as I know function 2 is new, and doesn't need to be supplied at the moment. Doing this would mean we can convert units as soon as they enter Matplotlib, only ever have to deal with floating point numbers internally, and then use the second function as late as possible when the user requests stuff like e.g. the axis limits.
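A minimal sketch of the two-function idea described above, not Matplotlib's actual API. The `Length` class, the `METERS_PER` table, and the names `to_float` / `from_float` are all hypothetical stand-ins; the only point is that units are stripped on entry, everything internal works on floats, and the inverse is applied only when the user asks for a value such as the axis limits.

```python
from dataclasses import dataclass


# Hypothetical unit object; stands in for whatever a unit library provides.
@dataclass(frozen=True)
class Length:
    value: float
    unit: str  # "m", "cm" or "in"


METERS_PER = {"m": 1.0, "cm": 0.01, "in": 0.0254}


def to_float(q: Length) -> float:
    """Function 1: unit object -> plain float (always meters here)."""
    return q.value * METERS_PER[q.unit]


def from_float(x: float) -> Length:
    """Function 2: plain float -> unit object, in the converter's own convention."""
    return Length(x, "m")


# Units are stripped as soon as the data enters the plotting layer ...
stored = [to_float(q) for q in (Length(8, "in"), Length(50, "cm"))]
# ... internal work (here: computing limits) happens on floats only ...
lo, hi = min(stored), max(stored)
# ... and units are restored only when the user asks for a value.
limits = (from_float(lo), from_float(hi))
print(limits)
```

Note that the user who supplied inches gets meters back; the inverse follows the converter's convention, not the unit of the original input.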

Also worth noting that any major change like this will go in to Matplotlib 3.0 at the earliest, so will be python 3 only.

David

On 7 February 2018 at 06:06, Jody Klymak <jklymak at uvic.ca> wrote:
Dear Ted,

Thanks so much for engaging on this.

Don't worry, nothing at all is changing w/o substantial back and forth, and an OK from downstream users. I actually don't think it'll be a huge change, probably just some clean up and better documentation.

FWIW, I've not personally done much programming w/ units, just been a bit perplexed by their inconsistent and (to my simple mind) convoluted application in the codebase. Having experience from people who try to use them every day will be absolutely key.

Cheers, Jody

On Feb 6, 2018, at 14:17 PM, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

We use units for everything in our system (in fact, we funded John Hunter originally to add in a unit system so we could use MPL) so it's a crucial system for us. In our system, we have our own time classes (which handle relativistic time frames as well as much higher precision representations) and a custom unit system for floating point values.

I think it's important to talk about these changes in concrete terms. I understand the words you're using, but I'm not really clear on what the real proposed changes are. For example, the current unit API returns a units.AxisInfo object so the converter can set the formatter and locators to use. Is that what you mean in the 2nd paragraph about ticks and labels? Or is that changing?

The current unit api is pretty simple and in units.ConversionInterface. Are any of these changes going to change the conversion API? (note - I'm not against changing it - I'm just not sure if there are any changes or not).

Another thing to consider: many of the examples people use are scripts which make a plot and stop. But there are other use cases which are more complicated and stress the system in different ways. We write several GUI applications (in PyQt) that use MPL for plotting. In these cases, the user is interacting with the plot to add and remove artists, change styles, modify data, etc etc. So having a good object oriented API for modifying things after construction is important for this to work. So when units are involved, it can't be a "convert once at construction" and never touch units again. We are constantly adjusting limits, moving artists, etc in unitized space after the plot is created.

So in addition to the ConversionInterface API, I think there are other items that would be useful to explicitly spell out. Things like which APIs in MPL should accept units and which won't, and which methods return unitized data and which don't. It would be nice if there was a clear policy on this. Maybe one exists and I'm not aware of it - it would be helpful to repeat it in a discussion on changing the unit system. Obviously I would love to have every method accept and return unitized data :-).

I bring this up because I was just working on a hover/annotation class that needed to move a single annotation artist with the mouse. To move the annotation box the way I needed to, I had to set one private member variable, call two set methods, use attribute assignment for one value, and set one semi-public member variable - some of which worked with units and some of which didn't. I think having a clear "this kind of method accepts/returns units" policy would help when people are adding new accessors/methods/variables to make it more clear what kind of data is acceptable in each.

Ted
ps: I may be able to help with some resources to work on any unit upgrades, but to make that happen I need to get a clear statement of what problem is being solved and the scope of the work so I can explain to our management why it's important.

________________________________________
From: Matplotlib-devel <matplotlib-devel-bounces+ted.drain=jpl.nasa.gov at python.org> on behalf of Jody Klymak <jklymak at uvic.ca>
Sent: Saturday, February 3, 2018 9:25 PM
To: matplotlib development list
Subject: [Matplotlib-devel] Units discussion...

Hi all,

To carry on the gitter discussion about unit handling, hopefully to lead to more stringent documentation and implementation.

In response to @anntzer I thought about the units support a bit - it seems that rather than a transform, a more straightforward approach is to have the converter map to float arrays in a unique way. This float mapping would be completely analogous to `date2num` in `dates`, in that it doesn't change and is perfectly invertible without matplotlib ever knowing about the unit information, though the axis could store it for the tick locators and formatters. It would also have an inverse that would supply data back to the user as unit-aware data (though not necessarily in the unit that the user supplied; e.g. if they supply 8*in and the converter converts everything to meter floats, then the returned unitized inverse would be 0.203*m, or whatever convention the converter wants to supply).

User "unit" control, i.e. making the plot in inches instead of m, would be accomplished with tick locators and formatters. Matplotlib would never directly convert between cm and inches (any more than it converts from days to hours for dates); the downstream-supplied tick formatter and labeller would do it.
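As a rough illustration of that division of labour, here is a sketch in which the stored floats are always meters and only the tick labels change with the display unit. `make_formatter` and its scale table are hypothetical helpers, not part of Matplotlib; a real converter would hand an equivalent callable to the axis as its formatter.

```python
METERS_PER_INCH = 0.0254


def make_formatter(display_unit: str):
    """Return a tick-label function for floats stored in meters (hypothetical helper)."""
    scale = {"m": 1.0, "cm": 100.0, "in": 1.0 / METERS_PER_INCH}[display_unit]

    def fmt(x_meters: float) -> str:
        # The stored value is untouched; only its label is rescaled.
        return f"{x_meters * scale:g} {display_unit}"

    return fmt


stored = [0.0, 0.254, 0.508]        # what the axis holds: always meters
in_inches = make_formatter("in")
in_cm = make_formatter("cm")
print([in_inches(x) for x in stored])   # ['0 in', '10 in', '20 in']
print([in_cm(x) for x in stored])       # ['0 cm', '25.4 cm', '50.8 cm']
```

Switching the plot from inches to centimeters then means swapping the formatter, with no change to any artist data.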

Each axis would only get one converter, set by the first call to the axis. Subsequent calls to the axis would pass all data (including bare floats) to the converter. If the converter wants to pass bare floats then it can do so. If it wants to accept other data types then it can do so. It should be possible for the user to clear or set the converter, but then they should know what they are doing and why.

What's missing? I don't think this is wildly different than what we have, but maybe a bit more clear.

Cheers, Jody

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel

--
Jody Klymak

Does numpy subclassing really matter? If the docs say the unit converter must convert from the external type to the internal type, then as long as the converter does that, it doesn't matter what the external type is or what it inherits from, right? The point is that the converter class is the only class manipulating the external data objects - MPL shouldn't care what they are or what they inherit from.

I think one issue is that data types are malleable in the API right now. Lists, tuples, numpy, ints, floats, etc are all possible inputs in many/most cases. IMO, the unit API should not be malleable at all. The unit converter API should say that the return type of external->internal conversion is always a specific value type (e.g. list of float, numpy float 64 array).
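A small sketch of what a non-malleable return type could look like, assuming "list of float" is the agreed internal type. The function name `convert` and its handling of scalars are illustrative choices only, not the existing converter signature.

```python
from numbers import Real


def convert(values) -> list:
    """Hypothetical converter: accept a scalar or any sequence, always return list[float]."""
    if isinstance(values, Real):
        values = [values]               # promote a bare scalar to a one-element sequence
    return [float(v) for v in values]   # ints, floats, tuples, lists all end up the same


print(convert(3))          # [3.0]
print(convert((1, 2.5)))   # [1.0, 2.5]
```

Whatever the input looked like, downstream code only ever sees one container type holding one element type.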

Yep, I think we all agree on this, but it ends up being messy...

Jody: IMO, your example should plot the data in inches in the first plot call, then convert the second input to inches and plot that. The plot call supports the xunits keyword argument, which tells the converter what floating point unit conversion to apply. If that keyword is not specified, then it defaults to the type of the input. The example that needs to be more clear is if I do this:

ax.plot( x1, y1, xunits="km" )
ax.plot( x2, y2, xunits="miles" )

IMO, either the floats are km or miles, not both. So either the first call sticks the converter to using km and the second xunits is ignored. Or the second input overrides the first and requires that the first artists go back through a conversion to miles. Either is a reasonable choice for behavior (but the first is much easier to implement).
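The first of those two behaviours (first call wins, later `xunits` ignored) can be sketched as follows. `ToyAxis`, the `(value, unit)` tuples, and the `KM_PER` table are hypothetical; this is not how Matplotlib's `Axes.plot` is implemented, only a model of the policy.

```python
KM_PER = {"km": 1.0, "miles": 1.609344}


class ToyAxis:
    """Hypothetical axis whose float unit is fixed by the first plot call."""

    def __init__(self):
        self.unit = None    # float unit of everything stored on this axis
        self.data = []

    def plot(self, quantities, xunits=None):
        """quantities: list of (value, unit) pairs; xunits: requested float unit."""
        if self.unit is None:
            # First call: honor xunits, else default to the unit of the input.
            self.unit = xunits or quantities[0][1]
        # Later calls: xunits is ignored and the data is converted to self.unit.
        floats = [v * KM_PER[u] / KM_PER[self.unit] for v, u in quantities]
        self.data.append(floats)
        return floats


ax = ToyAxis()
ax.plot([(1.0, "km"), (2.0, "km")], xunits="km")
second = ax.plot([(1.0, "miles")], xunits="miles")   # the request for miles is ignored
print(ax.unit, second)
```

The second behaviour (last call wins) would additionally have to revisit every list already in `ax.data`, which is why it is the harder one to implement.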

That'd be great. That's not what our toy does now. This way of setting the units is also not very flexible. I could imagine users wanting to change units at some point, either by setting the units in the `ax.plot` calls or explicitly on the `Axis` objects themselves. If we carry the unitized objects around, and only convert at draw time, post-facto conversion is fine. If we carry de-unitized data around, then we need an inverse so we can re-convert.

*My* idea, which some others have also espoused, is that the converter converts to floats (likely representing some base unit that makes sense, i.e. in the example above "meters" or "kilometers"). The "xunits" are malleable until draw time, at which point the Formatter and Locator decide how to format themselves. The xdata is never changed. That's basically how our datetime formatting works - it is converted to days since epoch and then the Formatter and Locator decide how to format the axis. I think this works equally well for other artists plotted in dataspace. Because you have an inverse function, other tools that rely on getting data-space data like cursor position, making a box, etc., can still return the inverse in appropriate units.
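The datetime analogy can be made concrete with the standard library alone. The sketch below assumes a 1970-01-01 UTC epoch and uses hypothetical helpers `to_days` / `from_days`; it is not Matplotlib's `date2num`, just the same shape of mapping: one fixed float representation, several possible label formats, and an inverse for reporting positions.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)   # assumed epoch for this sketch


def to_days(t: datetime) -> float:
    """Forward mapping: datetime -> float days since EPOCH."""
    return (t - EPOCH) / timedelta(days=1)


def from_days(x: float) -> datetime:
    """Inverse mapping: float days since EPOCH -> datetime."""
    return EPOCH + timedelta(days=x)


t = datetime(2018, 2, 8, 12, 0, tzinfo=timezone.utc)
x = to_days(t)                                   # stored once, never changed

# Two "formatters" labelling the same stored float differently.
print(from_days(x).strftime("%Y-%m-%d"))
print(from_days(x).strftime("%H:%M"))

# A cursor a quarter of a day to the right is reported back as a datetime.
cursor = from_days(x + 0.25)
print(cursor)
```

Changing how the axis is labelled never touches `x`, and anything that reads a data-space position can run it through `from_days` before handing it to the user.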

Cheers, Jody

On 8 Feb 2018, at 11:08, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

________________________________________
From: Nathan Goldbaum <nathan12343 at gmail.com>
Sent: Thursday, February 8, 2018 10:52 AM
To: Jody Klymak
Cc: Drain, Theodore R (392P); matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

On Thu, Feb 8, 2018 at 12:48 PM, Jody Klymak <jklymak at uvic.ca> wrote:

On 8 Feb 2018, at 09:54, Drain, Theodore R (392P) <theodore.r.drain at jpl.nasa.gov> wrote:

I think we can help with building a better toy unit system. Or we can standardize on datetime and some existing unit package. Whatever makes it easier for people to write test cases.

For me, the problem w/ datetime is that it is not fully featured units handling in that it doesn't support multiple units. It's really just a class of data that we have a known conversion to float for.

What we need an example of is how the following should work.

x = np.arange(10)
y = x*2 * myunitclass.inch   # spelled "inch" because "in" is a reserved word in Python
ax.plot(x, y)
z = x*2 * myunitclass.cm
ax.plot(x, z)

So when a new feature is added, we can ask that its units support is made clear. I guess I don't mind if those are astropy units or yt units, or pint, or ... though there will be some pushback about including another test dependency.

Would pint units work? It's a very small dependency, but maybe not as full featured, or structured wildly differently from the others?

One wrinkle: pint implements a "wrapper" array class rather than an ndarray subclass. Both astropy and yt use an ndarray subclass. There are some classes of errors that only happen for one style of unit arrays and other classes of errors that only happen for the other.

A test suite to my mind would
- test basic functionality
- test mixing allowed dimensions (i.e. inches and centimeters)
- test changing the axis units (so all the plotted data changes its values, *or* the tick locators/formatters change their values).
- test that disallowed mixed dimensions fail.
- ??
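The first, second, and fourth items of that list could look roughly like this as `unittest` cases. The `UNITS` table and `DimensionCheckedAxis` are a hypothetical toy unit system invented for the sketch; a real suite would exercise Matplotlib with an actual unit package instead.

```python
import unittest

# Hypothetical toy unit system: each unit has a dimension and a scale to a base unit.
UNITS = {"in": ("length", 0.0254), "cm": ("length", 0.01), "s": ("time", 1.0)}


class DimensionCheckedAxis:
    """Stores floats in the base unit of the first dimension it sees."""

    def __init__(self):
        self.dimension = None
        self.data = []

    def plot(self, values, unit):
        dimension, scale = UNITS[unit]
        if self.dimension is None:
            self.dimension = dimension
        elif dimension != self.dimension:
            raise ValueError(f"cannot mix {dimension} with {self.dimension}")
        self.data.append([v * scale for v in values])


class TestUnits(unittest.TestCase):
    def test_basic(self):
        ax = DimensionCheckedAxis()
        ax.plot([1, 2], "cm")
        self.assertEqual(len(ax.data), 1)

    def test_mixing_allowed_dimensions(self):
        ax = DimensionCheckedAxis()
        ax.plot([1.0], "in")
        ax.plot([2.54], "cm")           # same length, different unit
        self.assertAlmostEqual(ax.data[0][0], ax.data[1][0])

    def test_disallowed_mix_fails(self):
        ax = DimensionCheckedAxis()
        ax.plot([1.0], "in")
        with self.assertRaises(ValueError):
            ax.plot([1.0], "s")         # time on a length axis


# Run the cases without calling sys.exit(), so the snippet can be pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestUnits)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The third item (changing the axis units after plotting) is left out because its expected behaviour is exactly what is still being debated here.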

Cheers, Jody

________________________________________
From: Jody Klymak <jklymak at uvic.ca>
Sent: Thursday, February 8, 2018 9:39 AM
To: Drain, Theodore R (392P)
Cc: matplotlib development list
Subject: Re: [Matplotlib-devel] Units discussion...

I realize that units are "a pain", but they're hugely useful. Just plotting datetimes is going to be a pain without units (and was a huge pain before the unit system). The proposal that only Axes supports units is going to cause us a massive problem as that's rarely everything that we do with a plot. I could do a survey to find all the interactions we use (and that doesn't even touch the 1000's of lines of code our users have written) if that would help but anything that's part of the public api (axes, artists, patches, etc) is probably being used - i.e. pretty much anything that's in the current user's guide is something that we use/want/need to work with unitized data.

OK, *for discussion*: A scope of work for JPL and Matplotlib might be:

1) develop a better toy unit module that has most of the desired features (maybe the existing one is fine, but please see GitHub issue #9713 in matplotlib/matplotlib, "Units handling different with plot than other functions...", for why I'm a little dismayed with the state of things).

2) write a developer's guide explaining how units should be/are implemented
      a) in matplotlib modules
      b) by downstream developers (this is probably adequate already).

It sounds like what you are saying is that units should be carried to the draw stage (or cache stage) for all artists. That's maybe fine, but as a new developer, I found the units support woefully under-documented. The fact that others have hacked in units support in various inconsistent ways means that we need to police all this better.

OTOH, maybe Antony and I are poor people to lead this charge, given that we don't need unit support. But I don't think we are being hypercritical in pointing out it needs work.

Thanks a lot, Jody

--
Jody Klymak