Folks,
Notes from today's phone call. There are 3 things left for 2.1.1:
categorical changes, 'fuzzy' images when mostly invalid, and the appveyor
cleanup
Tom
Ryan May, Eric Firing, Thomas Caswell
** categorical
- everyone on board with not sorting categorical values
- everyone on board with only accepting strings as categories
- some concern about supporting `np.nan`
- if the first entry is `nan`, will miss units
- do not want a python loop that checks until it finds a not-nan
- defer nan handling
Tom's job to get all of the PRs collected and into one
Plan going forward to support mixed types, missing data, and explicit
ordering between categories:
- write a category class
- write a handler for pandas categorical
** Tom's pre-meeting notes
*** do not sort values
On one hand, sorting the values makes sense as
#+BEGIN_SRC python
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter([1, 2], [1, 2])
ax2.scatter([2, 1], [2, 1])
#+END_SRC
should produce visually identical plots so Tom thinks
#+BEGIN_SRC python
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.scatter(['a', 'b'], [1, 2])
ax2.scatter(['b', 'a'], [2, 1])
#+END_SRC
should as well (but Tom is wrong).
On the other hand, users may expect that there are some semantics in the
order they pass the data in, as in
#+BEGIN_SRC python
plt.bar(['first', 'second', 'third'], [1, 2, 3])
#+END_SRC
and blindly sorting alphabetically gives them no escape hatch.
Practicality over purity: drop the sorting.
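A minimal sketch of the difference, in plain Python rather than the actual
unit-handling code. `map_categories` and the labels are hypothetical names
made up for illustration:

```python
def map_categories(values, sort=False):
    """Map each distinct label to an integer position (hypothetical helper)."""
    # dict.fromkeys keeps the first-seen order and drops duplicates
    unique = list(dict.fromkeys(values))
    if sort:
        unique = sorted(unique)
    return {name: i for i, name in enumerate(unique)}


labels = ['low', 'medium', 'high']
as_passed = map_categories(labels)                # order the user gave
alphabetical = map_categories(labels, sort=True)  # 'high' jumps to the front
print(as_passed)
print(alphabetical)
```

With sorting, the user has no way to get 'low' before 'high' on the axis
short of renaming the labels.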
*** supporting non-string values
In the original implementation a cast through numpy was used which
converted all non-string values to strings so things like
#+BEGIN_SRC python
plt.bar([1, 2, 'apple'], [1, 2, 3])
#+END_SRC
would work. However this led to =2= and ='2'= being treated as
the same (which seems less than great). Supporting them as different
is possible, but is a fair bit of work because in a number of places the
unit framework assumes that 'plain' numbers will pass through
un-changed.
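To make the collision concrete, here is a rough stand-in for that cast. It
uses `str()` on each value instead of going through numpy, so it only
approximates the original implementation:

```python
values = [1, 2, '2', 'apple']

# Stand-in for the cast: every value becomes a string
as_strings = [str(v) for v in values]

# Assign positions in first-seen order
mapping = {name: i for i, name in enumerate(dict.fromkeys(as_strings))}
positions = [mapping[s] for s in as_strings]

# The int 2 and the string '2' end up at the same position
print(positions)
```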
A more worrying concern is that
#+BEGIN_SRC python
x = [52, 32, 'a', 'b']
y = [0, 1, 2, 3]
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(x, y, 'o')
ax2.plot(x[:2], y[:2], 'x')
#+END_SRC
in the first case the ints are treated as categoricals and in the
second they are not. If we want to support mixed types like this then
we need to make a special class (or use pandas categorical) which does
not have to guess the type on the fly.
Requiring that, if the categorical unit handling is triggered, all of the
values must be string-like seems like the safest approach.
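A sketch of what such a check could look like. `is_categorical` is a
hypothetical helper, not an existing matplotlib function, and treating
`bytes` as string-like is an assumption:

```python
def is_categorical(values):
    """Return True if every value is string-like, False if none are.

    Raises TypeError on a mix, so the type never has to be guessed.
    """
    flags = [isinstance(v, (str, bytes)) for v in values]
    if not flags or not any(flags):
        return False  # empty or plain numbers: leave to the normal path
    if all(flags):
        return True
    raise TypeError("categorical data must be all string-like")


x = [52, 32, 'a', 'b']
print(is_categorical(x[2:]))  # only strings
print(is_categorical(x[:2]))  # only ints
try:
    is_categorical(x)
except TypeError as err:
    print("rejected:", err)
```

With this rule `x` and `x[:2]` can no longer silently disagree about what
the ints mean; the mixed case fails loudly instead.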
*** support for nan
Most of matplotlib accepts `np.nan` as 'missing' data which is then
dropped as part of the draw process. This makes less sense with `bar` but
makes lots of sense with `scatter`.
We should special-case allowing `np.nan` in as a 'string' and map it
to itself.
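Roughly what 'map it to itself' could mean, as a plain-Python sketch.
`is_nan` and `convert` are hypothetical names, and this is not the
converter in the code base:

```python
import math


def is_nan(value):
    # np.nan is an ordinary Python float, so this also catches it
    return isinstance(value, float) and math.isnan(value)


def convert(values):
    """Map strings to positions; nan passes through untouched."""
    names = dict.fromkeys(v for v in values if not is_nan(v))
    mapping = {name: float(i) for i, name in enumerate(names)}
    return [v if is_nan(v) else mapping[v] for v in values]


result = convert(['a', float('nan'), 'b', 'a'])
print(result)
```

The nan never gets a category of its own, so it is dropped at draw time
like any other missing value.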
*** special containers
It was proposed to look for object arrays as a marker for categorical
instead of the type of the data. Do not think we should do this as we try
to be as agnostic about the container as possible everywhere else.
** appveyor
- drop building conda package
- remove conda recipe from the repo
Ryan is taking care of this
** set_theta_grid(frac)
- merged, improvement over current behavior, raising seems too
aggressive
** #8947
- ringing with lots of nans
this is Tom's job to investigate
** talked about traits / traitlets and friends
** major funding
- get mplot3D 'right'
- same interface
- uses real 3D tools