obscure colormapping issue related to quantization

Howdy all,

I have a very obscure (and minor) colormapping issue which I would like to discuss. I am writing a workaround and the question is whether or not this is worth changing the base matplotlib distribution. The issue applies to any code that uses colormapping such as matshow or imshow. I am going to write the change and it is up to the user community whether anyone else in the world cares about this.

In color.py::LinearSegmentedColormap.__call__
The code that determines the colormap value in the look up table is

         xa = (xa *(self.N-1)).astype(Int)
         rgba = zeros(xa.shape+(4,), Float)
         rgba[...,0] = take(self._red_lut, xa)
         rgba[...,1] = take(self._green_lut, xa)
         rgba[...,2] = take(self._blue_lut, xa)

I am using colormaps for thresholding data. If the normalized image value is less than some threshold I plot one color, if it is above that threshold, I plot a second color. All images produced this way are two colors. This is just what I do, but the issue is always there for thresholding.

Here's the issue. The code line xa = (xa *(self.N-1)).astype(Int) simply truncates the data, in essence taking the floor of xa, the reason why this matters is that value always get rounded down, so intensities above the threshold all get mapped to the threshold value. The error is small and dependent only on the quantization of the colormap, normally N=256. Nonetheless, there are times, like when the threshold is 0 that the rounded down values are visible and should not be.

The best way to make thresholds get rid of this problem is to use the ceiling function. So the code would read
         xa = ceil(xa *(self.N-1)).astype(Int)

Then intensities are rounded up, which is safe for thresholding. Note that the original code is not wrong. There are circumstance when it does the correct thing, even with thresholding. The new line also takes (not much) longer because of the ceil function.

There is a fudge involving only user code, which would be to negate the image data and reverse the colormap, but that is not particularly pretty.

My personal suggestion is to include a flag in the declaration of the colormap:

     def __init__(self, name, segmentdata, N=256, ceiling=False):
  self.ceiling = ceiling

then in the __call__ routine:

  if self.ceiling:
          xa = ceil(xa *(self.N-1)).astype(Int)
  else:
          xa = (xa *(self.N-1)).astype(Int)

Let me know what you think.

Danny

Isn't the essence of the problem the fact that the thresholds are specified as floats and the lookup table is quantized by its nature? (And that the current design allows for arbitrary N.) Whatever value you specify for a threshold (unless it it 0 or 1) won't necessarily correspond to a clean boundary between the quantized values (however you choose to define the boundary: floor, midpoint, or ceiling). Isn't it possible that your ceiling version has the same problem the other way (some values that are actually less now show up above the threshold)?

I'm wondering if some other approach isn't needed. But before going into more thought about that I'd like to clarify your usage.

Perry Greenfield

···

On Aug 1, 2005, at 1:07 PM, Danny Shevitz wrote:

Howdy all,

I have a very obscure (and minor) colormapping issue which I would like to discuss. I am writing a workaround and the question is whether or not this is worth changing the base matplotlib distribution. The issue applies to any code that uses colormapping such as matshow or imshow. I am going to write the change and it is up to the user community whether anyone else in the world cares about this.

In color.py::LinearSegmentedColormap.__call__
The code that determines the colormap value in the look up table is

        xa = (xa *(self.N-1)).astype(Int)
        rgba = zeros(xa.shape+(4,), Float)
        rgba[...,0] = take(self._red_lut, xa)
        rgba[...,1] = take(self._green_lut, xa)
        rgba[...,2] = take(self._blue_lut, xa)

I am using colormaps for thresholding data. If the normalized image value is less than some threshold I plot one color, if it is above that threshold, I plot a second color. All images produced this way are two colors. This is just what I do, but the issue is always there for thresholding.

Here's the issue. The code line xa = (xa *(self.N-1)).astype(Int) simply truncates the data, in essence taking the floor of xa, the reason why this matters is that value always get rounded down, so intensities above the threshold all get mapped to the threshold value. The error is small and dependent only on the quantization of the colormap, normally N=256. Nonetheless, there are times, like when the threshold is 0 that the rounded down values are visible and should not be.

The best way to make thresholds get rid of this problem is to use the ceiling function. So the code would read
        xa = ceil(xa *(self.N-1)).astype(Int)

Then intensities are rounded up, which is safe for thresholding. Note that the original code is not wrong. There are circumstance when it does the correct thing, even with thresholding. The new line also takes (not much) longer because of the ceil function.

There is a fudge involving only user code, which would be to negate the image data and reverse the colormap, but that is not particularly pretty.

My personal suggestion is to include a flag in the declaration of the colormap:

    def __init__(self, name, segmentdata, N=256, ceiling=False):
  self.ceiling = ceiling

then in the __call__ routine:

  if self.ceiling:
          xa = ceil(xa *(self.N-1)).astype(Int)
  else:
          xa = (xa *(self.N-1)).astype(Int)

Let me know what you think.

Danny