NaN support

It looks like the new transforms codebase does not support data that contains
NaNs the way the old trunk did:

plot([1,2,NaN,4,5]) produces a plot with no line break
plot([NaN,2,3,4,5]) produces a plot with no line

I know the use of NaNs is not encouraged, and we have discussed it on this list.
But since they are sometimes unintended and unavoidable, I'm reporting the
behavior so people are not surprised when they get an empty plot.

Darren

This comes up often enough that it might be worth doing automatic masking of NaNs in the high-level argument processing, wherever we are now doing x=npy.ma.asarray(x) or similar. The calls could be replaced with x=npy.ma.masked_where(~npy.isfinite(x),x). I have resisted adding this overhead, but now I think it might be negligible in almost all cases. It looks like it would be on the order of milliseconds per high-level call, where by high-level call I mean plot, pcolor, quiver, etc. (either via pylab or an Axes method).

Eric
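
The replacement Eric describes can be sketched with plain NumPy. This is only an illustration of the masked_where call quoted above, not code taken from matplotlib: the helper name mask_nonfinite is hypothetical, and the sketch uses the now-common np alias instead of the npy alias used on the list at the time.

```python
import numpy as np

def mask_nonfinite(x):
    """Return x as a masked array with NaN and +/-inf entries masked.

    Hypothetical helper; it wraps the masked_where call from the thread.
    """
    x = np.asarray(x, dtype=float)
    # ~isfinite is True for NaN and for +/-inf, so those entries get masked.
    return np.ma.masked_where(~np.isfinite(x), x)

y = mask_nonfinite([1, 2, np.nan, 4, 5])
print(y)                      # the NaN entry is displayed as masked
print(np.ma.getmaskarray(y))  # boolean mask, True where the value was not finite
```

Note that the mask is a second array of the same length as the data, which is the extra allocation this approach costs.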

I think this is an oversight on my part when doing the rewrite. The old trunk had special code to deal with NaNs in draw_lines (in some backends, but not all). Since draw_lines disappeared (it was replaced with draw_path), this functionality fell through the cracks. I think somehow bringing the old solution over (without the backend dependence) may be faster than what Eric is proposing, since it is done "on-the-fly" in C (at least for Agg) and doesn't require allocating any more memory for the mask.

However, it feels a bit dirty on a gut level to allow two subtly different ways to specify data with gaps. I almost lean toward forcing the user to provide masked arrays if that's what they want to do -- but that's probably not realistic. I think Darren is right -- sometimes NaNs come up when you least expect them, after many other data sets and plots have already worked, and it would be bad for the app to suddenly just fail in that case.

I'll look into bringing the old way over to the new code. Failing that, I think Eric's suggestion is quite reasonable.

Cheers,
Mike

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

I have something that works in r4833. It requires a clean rebuild, since the change was only to a header file. It produces no measurable slowdown on my fps test. (Of course, Eric's suggestion probably wouldn't have either...)

Please let me know how this works for you.

The nice thing about the new implementation is that it is in common code, so all backends should support NaNs (whereas before it was only Agg, Ps and Pdf). It does, curiously, also work on polygons etc. (the NaN-containing vertices are skipped), which doesn't make a whole lot of sense, but it's actually harder not to do that. IMHO, if you're putting NaNs in polygons you're asking for trouble anyway... If you are, speak up now... ;)

Cheers,
Mike
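
The idea of skipping NaN-containing vertices while walking the path can be sketched in pure Python. This is not the header change from r4833, which is not shown in this thread; the function name split_at_nans is hypothetical, and the sketch assumes that a skipped vertex also breaks the line at that point.

```python
import math

def split_at_nans(xs, ys):
    """Split a polyline into runs of consecutive finite vertices.

    Hypothetical helper: a vertex whose x or y is NaN is dropped and the
    line is broken there, so each returned run would be drawn separately.
    """
    runs, current = [], []
    for x, y in zip(xs, ys):
        if math.isnan(x) or math.isnan(y):
            if current:          # close the run that ended at this gap
                runs.append(current)
                current = []
        else:
            current.append((x, y))
    if current:                  # keep the final run, if any
        runs.append(current)
    return runs

nan = float("nan")
# The two cases from the original report, with x = 0..4:
print(split_at_nans([0, 1, 2, 3, 4], [1, 2, nan, 4, 5]))  # two runs, gap at x=2
print(split_at_nans([0, 1, 2, 3, 4], [nan, 2, 3, 4, 5]))  # one run starting at x=1
```

Because the vertices are examined one at a time as they are traversed, no separate mask array has to be allocated.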

Thank you, Mike.

On Wednesday 09 January 2008 04:36:21 pm Michael Droettboom wrote:

I have something that works in r4833. It requires a clean rebuild,
since the change was only to a header file. It produces no measurable
slowdown on my fps test. (Of course, Eric's suggestion probably
wouldn't have either...)

Please let me know how this works for you.

--
Darren S. Dale, Ph.D.
Staff Scientist
Cornell High Energy Synchrotron Source
Cornell University
275 Wilson Lab
Rt. 366 & Pine Tree Road
Ithaca, NY 14853

darren.dale@...143...
office: (607) 255-3819
fax: (607) 255-9001
http://www.chess.cornell.edu