NaN bugs

Ben_Axelrod · July 25, 2008, 3:08pm

I have noticed two bugs in the NaN handling of the scatter()
function, and one other bug that seems to be in numpy.

The min and max for the axes are not computed properly when there are
NaNs in the data. Example:

import pylab as pl
import numpy as np

x = np.asarray([0, 1, 2, 3, None, 5, 6, 7, 8, 9], float)
y = np.asarray([0, None, 2, 3, 4, 5, 6, 7, 8, 9], float)

ax = pl.subplot(111)
ax.scatter(x, y)
pl.show()

The points with NaN values are left out of the plot as expected, but you
will see that everything before the NaN is ignored when computing the axis
ranges. (The X axis goes from 4 to 10, cutting off some data, when it
should be from -1 to 10. The Y axis goes from 1 to 10 when it should also
be from -1 to 10.) This is rather annoying since these simple calls fix
the issue:

ax.set_xlim(min(x), max(x))
ax.set_ylim(min(y), max(y))

We see the same behavior for the ‘c’ axis. Example:

import pylab as pl
import numpy as np

x = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
y = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
z = np.asarray([0, 1, 2, 3, 4, 5, None, 7, 8, 9], float)

ax = pl.subplot(111)
ax.scatter(x, y, c=z)
pl.show()

We see that everything before point 7 has zero color. And we can bandaid
fix it by adding:

ax.scatter(x, y, c=z,
           vmin=min(z),
           vmax=max(z))

Then only the one NaN point has zero color.
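Another possible workaround for the color case is to drop the NaN points before calling scatter() at all, so that neither the color scaling nor the axis limits ever see a NaN. The sketch below is only an illustration under that assumption: `drop_nan_points` is a hypothetical helper name, not part of matplotlib or numpy, and it removes the NaN point from the plot entirely instead of drawing it with zero color.

```python
import numpy as np

def drop_nan_points(x, y, c):
    """Keep only the points where x, y and c are all finite.

    Hypothetical helper for illustration; not a matplotlib or numpy function.
    """
    x, y, c = (np.asarray(a, dtype=float) for a in (x, y, c))
    keep = np.isfinite(x) & np.isfinite(y) & np.isfinite(c)
    return x[keep], y[keep], c[keep]

x = np.arange(10, dtype=float)
y = np.arange(10, dtype=float)
z = np.array([0, 1, 2, 3, 4, 5, np.nan, 7, 8, 9], dtype=float)

xf, yf, zf = drop_nan_points(x, y, z)

# zf no longer contains NaN, so the plain array min/max are well defined.
vmin, vmax = zf.min(), zf.max()

# The cleaned arrays can then be passed on, for example:
# ax.scatter(xf, yf, c=zf, vmin=vmin, vmax=vmax)
```

Whether dropping the point is acceptable depends on the plot; if the NaN point should still be drawn, the vmin/vmax approach above is the closer match.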

Both of the above-mentioned bandaid fixes suffer from another bug
(I think in numpy): min() and max() give the wrong result for a
numpy array whose first value is NaN:

x = np.asarray([None, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
y = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, None], float)
z = np.asarray([0, 1, 2, 3, 4, 5, None, 7, 8, 9], float)

print min(x), max(x) #prints 1.#QNAN 1.#QNAN
print min(y), max(y) #prints 0.0 8.0
print min(z), max(z) #prints 0.0 9.0

FYI, I am using Matplotlib version 0.91.4 and NumPy 1.1.0 on Windows
and Debian Linux.

Thanks,

-Ben

_John_Hunter1 · July 25, 2008, 4:39pm

I believe this is fixed in svn (0.98 branch) -- I tested your first
example and it behaved as expected. If you have a build environment,
please test the release candidate

http://matplotlib.sourceforge.net/tmp/matplotlib-0.98.3rc1.tar.gz

Any other users who would like to test the release candidate, we would
be much obliged. We do not have any binaries for testing
unfortunately.

JDH


On Fri, Jul 25, 2008 at 10:08 AM, Ben Axelrod wrote:

I have noticed 2 bugs having to do with NaN handling in the scatter()

_Ryan_May1 · July 25, 2008, 6:56pm

Ben Axelrod wrote:

3. Both of the above mentioned bandaid fixes suffer from some bug (I think in numpy). Where the min() and max() of a numpy array where the first value is NaN, bugs out:

x = np.asarray([None, 1, 2, 3, 4, 5, 6, 7, 8, 9], float)
y = np.asarray([0, 1, 2, 3, 4, 5, 6, 7, 8, None], float)
z = np.asarray([0, 1, 2, 3, 4, 5, None, 7, 8, 9], float)

print min(x), max(x) #prints 1.#QNAN 1.#QNAN
print min(y), max(y) #prints 0.0 8.0
print min(z), max(z) #prints 0.0 9.0

It's actually pure luck that min/max worked at all. What you want is numpy.nanmax() and numpy.nanmin(), which properly handle NaNs in your array.
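The "luck" comes from how the builtin min() and max() work: they keep a running best value and replace it only when a comparison succeeds, and every comparison against NaN is False. A leading NaN therefore sticks as the result, while a NaN later in the array is silently skipped. A minimal sketch of the NaN-aware calls on the same three arrays follows; `nan_range` is a hypothetical helper name used only for illustration, and np.nan is written directly in place of None.

```python
import numpy as np

# Same data as in the report, with NaN first, last, and in the middle.
x = np.array([np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
y = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, np.nan], dtype=float)
z = np.array([0, 1, 2, 3, 4, 5, np.nan, 7, 8, 9], dtype=float)

def nan_range(a):
    """Return (min, max) of a, ignoring NaN entries (hypothetical helper)."""
    return float(np.nanmin(a)), float(np.nanmax(a))

for name, arr in (("x", x), ("y", y), ("z", z)):
    print(name, nan_range(arr))
# x (1.0, 9.0)
# y (0.0, 8.0)
# z (0.0, 9.0)
```

The result no longer depends on where the NaN sits, so the same pair can be used for set_xlim()/set_ylim() or for vmin/vmax.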

Ryan


--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma