Categorical variables and twinx make basically no sense

ajm8671 · March 14, 2019, 4:40am

Example code is pasted below. Basically, just wanted to say that this is
either completely non-intuitive, or I am failing to understand something
fundamental about matplotlib.

My expectation was that twinx would let me plot categorical values at the
positions they already had before I twinned the axes (i.e., using the
original axes' order and coordinates). But if you use categorical values,
it will truncate them to the length of the new categorical list/array. In
the example code, if you comment out the x1/x2 categorical assignments and
use integers instead (uncomment those), it works as you'd expect. You can
even swap the order in which the integers are plotted, but NOT if you use
the text assignments, even though the integer and text versions describe
the same axis positions.

Gallery: https://imgur.com/a/NTgCFXB

Anyway, someone please let me know if there is some design principle I'm
missing here, or if this is a special case. It isn't covered in any of the
top-level documentation (see
https://matplotlib.org/gallery/lines_bars_and_markers/categorical_variables.html
and https://matplotlib.org/examples/api/two_scales.html) and it took me the
better part of this afternoon to figure out. The only conclusion I can come
to is that matplotlib treats these values differently, and converts the
text arrays to integers under the hood without trying to align them.

Cheers,

AJ

Code:

import matplotlib.pyplot as plt

x1 = ['apples', 'bananas', 'cheerios']
##x1 = [1,3,5]
y1 = [5, 6, 15]

x2 = ['apples','carrots','bananas','watermelon','cheerios']
##x2 = [1,2,3,4,5]
y2 = [100, 200, 300, 400, 500]

fig, ax = plt.subplots()

ax.scatter(x2, y2)
ax2 = ax.twinx()
ax2.scatter(x1, y1, color = "orange")

plt.show()

jklymak1 · March 14, 2019, 5:29am

That's basically correct to my understanding - the twin axes is a new axes that shares its x-limits with the old one, but we don't have any way of passing "categories" from one axes to the next, so it carries its own category->integer conversion list, which gets made anew when you call scatter. The first axes is the one that gets the tick labels.

You *may* be able to pass the converter to the second axes, but I'm not sure.
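
To make the per-axes conversion described above concrete, here is a minimal pure-Python sketch. It is a simplified model, not matplotlib's actual code: build_category_mapping is a hypothetical helper, and it assumes each axes numbers its categories from 0 in order of first appearance.

```python
def build_category_mapping(values):
    """Assign each distinct string the next integer, in order of first appearance.

    Hypothetical helper: a simplified stand-in for the category->integer
    conversion that each axes is assumed to build independently.
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


x1 = ['apples', 'bananas', 'cheerios']
x2 = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']

# The first axes only ever sees x2; the twin only ever sees x1.
first_axes = build_category_mapping(x2)
twin_axes = build_category_mapping(x1)

# Compare where each shared category lands on the two axes.
for name in x1:
    print(name, first_axes[name], twin_axes[name])
```

Under that assumption, 'bananas' sits at position 2 on the first axes but position 1 on the twin, which would explain why the orange points do not line up with the tick labels drawn by the first axes.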

Easier would be to just do the following:


import matplotlib.pyplot as plt
import numpy as np

y1 = [5, np.nan, 6, np.nan, 15]
y2 = [100, 200, 300, 400, 500]

x = ['apples','carrots','bananas','watermelon','cheerios']

fig, ax = plt.subplots()
ax.scatter(x, y2)
ax2 = ax.twinx()
ax2.scatter(x, y1, color = "orange")
plt.show()

Cheers, Jody
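
If the two series start out with different category lists, as in the original question, the NaN padding can be generated rather than typed by hand. This is only a sketch: align_to_categories is a hypothetical helper, not part of matplotlib, and it assumes each category appears at most once in the shorter list.

```python
import math


def align_to_categories(categories, x, y):
    """Return the y values reordered to match `categories`.

    Hypothetical helper: categories with no matching entry in `x` get NaN,
    so the result has the same length and order as `categories`.
    """
    lookup = dict(zip(x, y))
    return [lookup.get(name, math.nan) for name in categories]


x = ['apples', 'carrots', 'bananas', 'watermelon', 'cheerios']
x1 = ['apples', 'bananas', 'cheerios']
y1 = [5, 6, 15]

# Same length and order as x, with NaN where x1 had no entry.
y1_aligned = align_to_categories(x, x1, y1)
```

The aligned list can then be plotted against the full category list, e.g. ax2.scatter(x, y1_aligned, color="orange"), so both axes see the same categories in the same order.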

ajm8671 · March 14, 2019, 8:56am

Great, thanks for the reply. I tried the NaN solution during my tinkering
but thought I should somehow be able to get away without recreating the
axes in the same dimensions, since it's not required when using integers.
Just making sure I wasn't missing anything more obvious.
