histogram bug

John_Hunter · December 27, 2006, 8:18pm

    > Oops, I replied to your previous message before seeing this
    > one. Still, the larger question remains: maybe we should do
    > something to make it easier for users to understand what is
    > going on when the transform chokes on log(0). Changing
    > numbers <=0 to a small positive number and issuing a warning
    > would accomplish this, and I don't see much disadvantage.

This is tricky to implement in practice. Eg, what if the user did a
bar graph where the heights were order 1e-10? Without knowing what
the user intended when creating the graphics primitives it is
difficult to know what to do with them. I am hesitant to alter data
at the level of graphics primitives without knowing the operation that
created them. One possible solution may be to simply create a
helper function (loghist, logbar) which works like semilogx: it knows
what the user wants to do and does the right thing, in this case
making sure that the "bottom" of the rectangles is some suitable
positive number less than all the heights.
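
A minimal sketch of what such a helper might compute, in plain Python. The name `log_bar_bottom` and its `factor` parameter are hypothetical and not part of matplotlib; the only idea taken from the text above is to choose a positive "bottom" that lies below every positive height.

```python
def log_bar_bottom(heights, factor=10.0):
    """Return a positive baseline smaller than every positive height.

    Hypothetical helper: `factor` sets how far below the smallest
    positive height the baseline sits (one decade by default).
    """
    # Nonpositive heights cannot appear on a log axis, so ignore them here.
    positive = [h for h in heights if h > 0]
    if not positive:
        raise ValueError("need at least one positive height for a log axis")
    return min(positive) / factor


# Heights of order 1e-10 are handled like any other positive height.
heights = [3.0, 0.0, 1e-10, 42.0]
bottom = log_bar_bottom(heights)
```

The result would then be passed as the `bottom` keyword when drawing the bars; how to treat the bars whose height is zero or negative is left open here, as it is in the discussion.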

I definitely agree that the error message is not terribly helpful.
One possibility is to inspect most of the objects at set_xscale and
set_yscale and issue a warning if there is non-positive data.

Eg: 'one or more patches has a non-positive y coordinate'

This won't be too helpful for mpl newbies who don't know what a patch
is, but it will provide some additional information (at the expense of
inspecting all the data at scale changes).

Something like

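import warnings

if xscale == 'log':
    # Check each kind of artist; warn once per kind, then stop looking.
    for line in self.lines:
        xdata = line.get_xdata(valid_only=True)
        if min(xdata) <= 0.:
            warnings.warn('one or more lines has a non-positive x coordinate')
            break

    for patch in self.patches:
        if min([x for x, y in patch.get_verts()]) <= 0.:
            warnings.warn('one or more patches has a non-positive x coordinate')
            break

    for collection in self.collections:
        if min([x for x, y in collection.get_verts()]) <= 0.:
            warnings.warn('one or more collections has a non-positive x coordinate')
            break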

JDH

Eric_Firing1 · December 28, 2006, 1:59am

John Hunter wrote:

    > Oops, I replied to your previous message before seeing this
    > one. Still, the larger question remains: maybe we should do
    > something to make it easier for users to understand what is
    > going on when the transform chokes on log(0). Changing
    > numbers <=0 to a small positive number and issuing a warning
    > would accomplish this, and I don't see much disadvantage.

> This is tricky to implement in practice. Eg, what if the user did a
> bar graph where the heights were order 1e-10? Without knowing what
> the user intended when creating the graphics primitives it is
> difficult to know what to do with them. I am hesitant to alter data
> at the level of graphics primitives without knowing the operation that
> created them. One possible solution may be to simply create a

John,

Adjusting zero and negative values (or maybe just zero) would be unacceptable in a numerics library, but in the context of our graphical transforms it is analogous to clipping, and this we do all the time--we don't raise an exception if someone tries to plot outside the box. (This clipping strategy to handle nonpositive values is present already in the LogLocator.)

We can use such a small adjustment value that a problem such as you mention above is highly unlikely--and note that floating point itself has limitations, and does not permit arbitrarily small or large numbers. Furthermore, note that the user can always take advantage of the bottom kwarg. And if in some extreme case the user has not used the bottom kwarg and the bars really are shorter than the adjustment value, it will probably be quite obvious.

It is in ordinary line plotting that adjusting the value could be misleading--it plots an extremely small number (if the data limits are set to include it) instead of zero. Maybe this is enough of a drawback to nix the whole idea.

Every alternative that you propose is more complicated and less comprehensive than the low-level adjustment, however, and I see little if any real advantage to the alternatives.
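
To make the proposal concrete, here is a minimal pure-Python sketch of the clipping behaviour, with the warning from the original suggestion included. The function name `clipped_log10` and the adjustment value `TINY = 1e-300` are assumptions chosen for illustration; the sketch shows the intended behaviour only, not how it would be done inside the C++ transform code.

```python
import math
import warnings

# Assumed adjustment value: a positive float far below any realistic data.
TINY = 1e-300


def clipped_log10(values, tiny=TINY):
    """Return log10 of each value, clipping nonpositive inputs to `tiny`."""
    clipped = False
    result = []
    for v in values:
        if v <= 0:
            # Analogous to clipping: substitute a tiny positive number
            # instead of raising an error.
            v = tiny
            clipped = True
        result.append(math.log10(v))
    if clipped:
        # Warn once per call rather than once per adjusted value.
        warnings.warn(
            "nonpositive values were clipped to %g before taking log10" % tiny
        )
    return result
```

With data limits that include the clipped points, a zero would show up at roughly 1e-300 on the axis, which is the drawback for ordinary line plots mentioned above.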

> helper function (loghist, logbar) which works like semilogx: it knows
> what the user wants to do and does the right thing, in this case
> making sure that the "bottom" of the rectangles is some suitable
> positive number less than all the heights.
>
> I definitely agree that the error message is not terribly helpful.

If you still don't want the adjustment, then the easiest way to improve the error message would be to raise a Python exception instead of a C++ error in places like:

         for(int i=0; i < length; i++)
         {
                 // test the current element, not the array pointer itself
                 if (x[i]<=0) { throw std::domain_error("Cannot take log of nonpositive value"); }
                 else newx[i] = log10(x[i]);
         }

The domain error message above is informative, but it never makes it out to the user.
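
Seen from the Python side, the behaviour being asked for might look like the sketch below. `strict_log10` is a hypothetical stand-in for the C++ loop, not matplotlib code; the point is only that the same message arrives as an ordinary Python exception that the user can read and catch.

```python
import math


def strict_log10(values):
    """Return log10 of each value; raise ValueError on a nonpositive input."""
    result = []
    for i, v in enumerate(values):
        if v <= 0:
            # Same message as the C++ domain_error, plus where it happened.
            raise ValueError(
                "Cannot take log of nonpositive value %r at index %d" % (v, i)
            )
        result.append(math.log10(v))
    return result
```

Including the offending value and its index is an addition made for this sketch; the C++ message above carries neither.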

It looks like one variation on my suggestion--warning when adjusting a nonpositive value--would be difficult to implement.

Eric
