spy: ignore zero values in sparse matrix

When sparse matrices have explicit zero values, `axes.spy` plots those zero values. This behavior seems unintentional. For example, the following code should have a main diagonal with markers missing in the middle, but `spy` currently plots a full main diagonal.

···

#~~~~~~~~~~~
import scipy.sparse as sparse
import matplotlib.pyplot as plt

sp = sparse.spdiags([[1,1,1,0,0,0,1,1,1]], [0], 9, 9)
plt.spy(sp, marker='.')
#~~~~~~~~~~~
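
A minimal sketch of why the extra markers appear, assuming a recent SciPy and NumPy. It builds the COO matrix directly rather than through `spdiags`, since some SciPy versions may drop zeros when converting from the diagonal format; the variable names are illustrative only.

```python
import numpy as np
import scipy.sparse as sparse

# Same diagonal as the example above, stored with explicit zeros.
diag = np.array([1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
idx = np.arange(9)
c = sparse.coo_matrix((diag, (idx, idx)), shape=(9, 9))

# Every stored entry has a (row, col) pair, whether or not its value is zero.
stored = c.nnz

# Keeping only entries whose value is nonzero leaves the gap in the middle.
nonzero = c.data != 0
rows = c.row[nonzero]
cols = c.col[nonzero]
print(stored, len(rows))
```

Here `c.nnz` counts stored entries, explicit zeros included, which is why plotting `c.row` and `c.col` unfiltered fills in the whole diagonal.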

Below is a patch that plots only the nonzero entries in a sparse matrix. Note that a sparse matrix whose entries are all zero raises an error; this behavior differs from dense matrices. I could change this behavior, but I wanted to minimize the code changed.

Cheers,

-Tony

PS: this patch also includes two trivial changes to some examples.

Index: lib/matplotlib/axes.py

--- lib/matplotlib/axes.py (revision 6122)
+++ lib/matplotlib/axes.py (working copy)
@@ -6723,9 +6723,11 @@
          else:
              if hasattr(Z, 'tocoo'):
                  c = Z.tocoo()
-                 y = c.row
-                 x = c.col
-                 z = c.data
+                 nonzero = c.data != 0.
+                 if all(nonzero == False):
+                     raise ValueError('spy cannot plot sparse zeros matrix')
+                 y = c.row[nonzero]
+                 x = c.col[nonzero]
              else:
                  Z = np.asarray(Z)
                  if precision is None: mask = Z!=0.
Index: examples/pylab_examples/masked_demo.py

--- examples/pylab_examples/masked_demo.py (revision 6122)
+++ examples/pylab_examples/masked_demo.py (working copy)
@@ -1,4 +1,4 @@
-#!/bin/env python
+#!/usr/bin/env python
  '''
  Plot lines with points masked out.

Index: examples/misc/rec_groupby_demo.py

--- examples/misc/rec_groupby_demo.py (revision 6122)
+++ examples/misc/rec_groupby_demo.py (working copy)
@@ -2,7 +2,7 @@
  import matplotlib.mlab as mlab

-r = mlab.csv2rec('data/aapl.csv')
+r = mlab.csv2rec('../data/aapl.csv')
  r.sort()

  def daily_return(prices):

Is raising an exception the right choice here -- why can't we plot an
all zeros image?

JDH

···

On Fri, Sep 26, 2008 at 12:39 PM, Tony S Yu <tonyyu@...608...> wrote:

+ if all(nonzero == False):
+     raise ValueError('spy cannot plot sparse zeros matrix')

I guess you could plot sparse all-zero matrices with image mode. My only hesitation is that sparse arrays tend to be very large and (I imagine) this would lead to very slow performance. I assumed this was the reason image mode wasn't adapted to use sparse arrays.

Actually, now that I think about it: you could plot a trivially small image and just adjust the coordinates so that they correspond to the original matrix shape. Is this what you were thinking?
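
A rough sketch of that idea, assuming a current Matplotlib. The shape `(nr, nc)` and the 1x1 placeholder array are made up for illustration, and this is not necessarily how `spy` itself handles the case.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so no display is required
import matplotlib.pyplot as plt
import numpy as np

nr, nc = 1000, 2000  # hypothetical shape of a large all-zero sparse matrix

fig, ax = plt.subplots()
# A single-pixel image stands in for the whole matrix; `extent` stretches it
# over the same index range that spy's marker mode uses for its axis limits.
im = ax.imshow(np.zeros((1, 1)), cmap="binary", vmin=0, vmax=1,
               interpolation="nearest", aspect="equal",
               extent=(-0.5, nc - 0.5, nr - 0.5, -0.5))
xlim = tuple(ax.get_xlim())
ylim = tuple(ax.get_ylim())
```

The image costs one pixel of storage regardless of the matrix size, while the axes still report row and column indices of the original shape.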

I should note that a dense zero array also fails to plot with spy *if marker mode is used*.

-T

···

On Sep 26, 2008, at 2:28 PM, John Hunter wrote:

On Fri, Sep 26, 2008 at 12:39 PM, Tony S Yu <tonyyu@...608...> wrote:

+ if all(nonzero == False):
+     raise ValueError('spy cannot plot sparse zeros matrix')

Is raising an exception the right choice here -- why can't we plot an
all zeros image?

JDH

I guess you could plot sparse all-zero matrices with image mode. My only
hesitation is that sparse arrays tend to be very large and (I imagine) this
would lead to very slow performance. I assumed this was the reason image
mode wasn't adapted to use sparse arrays.

Actually, now that I think about it: you could plot a trivially small image
and just adjust the coordinates so that they correspond to the original
matrix shape. Is this what you were thinking?

This is something I considered, but I was thinking less about the
implementation and more about the functionality. I don't want to
raise an exception unless the input doesn't make sense. I would
rather the user stare at a boring image and figure out why it is blank
than deal with an exception.

I should note that a dense zero array also fails to plot with spy *if marker
mode is used*.

Can you fix this along with spy2?

JDH

···

On Fri, Sep 26, 2008 at 2:36 PM, Tony S Yu <tonyyu@...608...> wrote:

Actually, now that I think about it: you could plot a trivially small image
and just adjust the coordinates so that they correspond to the original
matrix shape. Is this what you were thinking?

This is something I considered, but I was thinking less about the
implementation and more about the functionality. I don't want to
raise an exception unless the input doesn't make sense. I would
rather the user stare at a boring image and figure out why it is blank
than deal with an exception.

Yeah, I agree this is much friendlier.

I should note that a dense zero array also fails to plot with spy *if marker
mode is used*.

Can you fix this along with spy2?

I assume you mean spy, not spy2 (I just searched through the matplotlib files and saw that spy2 hasn't existed since 2006). I'll work on a patch to return a blank plot using the method described above (unless someone chimes in with a better suggestion).

-Tony

···

On Sep 26, 2008, at 3:38 PM, John Hunter wrote:

On Fri, Sep 26, 2008 at 2:36 PM, Tony S Yu <tonyyu@...608...> wrote:

Tony S Yu wrote:

+ if all(nonzero == False):
+     raise ValueError('spy cannot plot sparse zeros matrix')

Is raising an exception the right choice here -- why can't we plot an
all zeros image?

JDH

I guess you could plot sparse all-zero matrices with image mode. My only hesitation is that sparse arrays tend to be very large and (I imagine) this would lead to very slow performance. I assumed this was the reason image mode wasn't adapted to use sparse arrays.

Also, if an image cannot be resolved by the output device, info is lost--one might not see anything at a location where there actually is a value--whereas with markers, a marker will always show up, and the only problem is that one can't necessarily distinguish a single point from a cluster.

The real problem with all-zero values is that plot can't handle "plot([],[])". One can work around this by putting in bogus values to plot a single point, saving the line, and then setting the line data to empty; or, better, by not using the high-level plot command, but by generating the Line2D object and adding it to the axes. The Line2D initializer is happy with empty x and y sequences. I think if you use this approach it will kill two bugs (failure on all-zeros with sparse and full arrays) with one very simple stone.
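
The second workaround can be sketched as follows, assuming a current Matplotlib. The 9x9 shape and the marker settings are placeholders, and `add_line` is used here as one way to attach the artist; it is a sketch, not the code that went into `spy`.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so no display is required
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

nr, nc = 9, 9  # hypothetical shape of an all-zero matrix

fig, ax = plt.subplots()
# Build the artist directly instead of going through the high-level plot();
# the Line2D initializer accepts empty x and y sequences.
line = mlines.Line2D([], [], linestyle='None', marker='s', markersize=10)
ax.add_line(line)

# Same limits spy uses in marker mode, so the blank axes show the matrix shape.
ax.set_xlim(-0.5, nc - 0.5)
ax.set_ylim(nr - 0.5, -0.5)
fig.canvas.draw()  # rendering the empty line should not raise
```

Because the same `Line2D` path works whether or not there are points to draw, the sparse and dense all-zero cases need no special handling beyond this.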

Eric

Thanks for the tip Eric. Below is a patch for spy that implements Eric's suggestion. This patch seems to work for a couple simple tests on my end: sparse and dense arrays with non-zero and all-zero values.

A couple of notes:

* the call to `add_artist` isn't needed to show the correct plot, but it may be helpful for debugging.

* the docstring for `spy` suggests that a Line2D instance is returned, but `spy` currently returns a list with a Line2D instance. I set all-zero arrays to return a list also, for consistency.

-Tony

Index: matplotlib/lib/matplotlib/axes.py

===================================================================
--- matplotlib/lib/matplotlib/axes.py (revision 6123)
+++ matplotlib/lib/matplotlib/axes.py (working copy)
@@ -6723,9 +6723,9 @@
          else:
              if hasattr(Z, 'tocoo'):
                  c = Z.tocoo()
-                 y = c.row
-                 x = c.col
-                 z = c.data
+                 nonzero = c.data != 0.
+                 y = c.row[nonzero]
+                 x = c.col[nonzero]
              else:
                  Z = np.asarray(Z)
                  if precision is None: mask = Z!=0.
@@ -6733,8 +6733,12 @@
                  y,x,z = mlab.get_xyz_where(mask, mask)
              if marker is None: marker = 's'
              if markersize is None: markersize = 10
-             lines = self.plot(x, y, linestyle='None',
-                               marker=marker, markersize=markersize, **kwargs)
+             if len(x) == 0:
+                 lines = [mlines.Line2D([], [])]
+                 self.add_artist(lines[0])
+             else:
+                 lines = self.plot(x, y, linestyle='None',
+                                   marker=marker, markersize=markersize, **kwargs)
              nr, nc = Z.shape
              self.set_xlim(xmin=-0.5, xmax=nc-0.5)
              self.set_ylim(ymin=nr-0.5, ymax=-0.5)
Index: matplotlib/examples/pylab_examples/masked_demo.py

--- matplotlib/examples/pylab_examples/masked_demo.py (revision 6123)
+++ matplotlib/examples/pylab_examples/masked_demo.py (working copy)
@@ -1,4 +1,4 @@
-#!/bin/env python
+#!/usr/bin/env python
  '''
  Plot lines with points masked out.

Index: matplotlib/examples/misc/rec_groupby_demo.py

--- matplotlib/examples/misc/rec_groupby_demo.py (revision 6123)
+++ matplotlib/examples/misc/rec_groupby_demo.py (working copy)
@@ -2,7 +2,7 @@
  import matplotlib.mlab as mlab

-r = mlab.csv2rec('data/aapl.csv')
+r = mlab.csv2rec('../data/aapl.csv')
  r.sort()

  def daily_return(prices):

Tony S Yu wrote:

Thanks for the tip Eric. Below is a patch for spy that implements Eric's suggestion. This patch seems to work for a couple simple tests on my end: sparse and dense arrays with non-zero and all-zero values.

Tony,

Thanks. I will take care of this shortly, along with fixing the failure of plot([],[]) and maybe a few other things.

Eric

···

On Sep 26, 2008, at 5:01 PM, Eric Firing wrote:

Tony S Yu wrote:

Thanks for the tip Eric. Below is a patch for spy that implements Eric's suggestion. This patch seems to work for a couple simple tests on my end: sparse and dense arrays with non-zero and all-zero values.

A couple of notes:

* the call to `add_artist` isn't needed to show the correct plot, but it may be helpful for debugging.

* the docstring for `spy` suggests that a Line2D instance is returned, but `spy` currently returns a list with a Line2D instance. I set all-zero arrays to return a list also, for consistency.

Tony,

Changes to spy and a few other things are in svn 6127.

Regarding your last point, the docstring made more sense than the original implementation, so I changed the implementation to correspond to it. I hope this does not cause more trouble than it is worth; if it looks like it will, then I can easily change the behavior back and modify the docstring. It seems silly to always return a list with a single item, though, and I doubt that many people are making heavy use of the return value of the spy method anyway.

Regarding your original idea, that sparse arrays should be handled like ordinary arrays, with only nonzero values plotted: I think this is going too far, and not far enough, so I did the following:

1) If "precision" is None or a non-zero value, the behavior for sparse and ordinary arrays is identical. Previously, the precision kwarg was silently ignored for sparse arrays. Now it is used.

2) If "precision" is 0, then one gets the old behavior: all locations with data are shown, regardless of value. It seems to me that one really wants to have this behavior available, to see how much of a sparse array is filled in.
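
The two rules above can be sketched in plain NumPy and SciPy. The helper name `stored_coords` is hypothetical and this is not the code committed in svn 6127; it only mirrors the behavior as described in items 1) and 2).

```python
import numpy as np
import scipy.sparse as sparse

def stored_coords(Z, precision=None):
    """Hypothetical helper: marker coordinates under rules 1) and 2)."""
    c = Z.tocoo()
    if precision is not None and precision == 0:
        # Rule 2: show every stored location, regardless of value.
        return c.row, c.col
    if precision is None:
        keep = c.data != 0.          # rule 1, default: nonzero values only
    else:
        keep = np.absolute(c.data) > precision  # rule 1, finite threshold
    return c.row[keep], c.col[keep]

# Four stored entries on the diagonal, one of them an explicit zero.
data = np.array([1.0, 0.0, 0.05, -2.0])
idx = np.arange(4)
Z = sparse.coo_matrix((data, (idx, idx)), shape=(4, 4))

default_rows, _ = stored_coords(Z)                    # precision=None
all_rows, _ = stored_coords(Z, precision=0)           # old behavior
coarse_rows, _ = stored_coords(Z, precision=0.1)      # threshold
```

With this input the default keeps three locations, `precision=0` keeps all four, and `precision=0.1` keeps the two entries whose magnitude exceeds the threshold.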

I am not entirely comfortable with the way the "precision" kwarg is being used to control this, but it seemed preferable to adding another kwarg. Alternatives could include swapping the roles of 0 and None, or letting precision take a string value to specify the old behavior.

Actually, I think the most logical thing would be to let the default None give the old behavior, and require precision=0 to get the new behavior. What do you think? Is it OK if I make this change? It is more consistent with the old behavior.

I also changed the behavior so that if a sparse array is input, with no marker specifications, it simply makes a default marker plot instead of raising an exception.

Eric

Hi Eric,

Sorry for the late reply.

Actually, I think the most logical thing would be to let the default None give the old behavior, and require precision=0 to get the new behavior. What do you think? Is it OK if I make this change? It is more consistent with the old behavior.

I'm ambivalent about this change. On one hand, I think it makes a lot more sense to have None give the old behavior and precision=0 to ignore zero values in the sparse array (then precision would be consistent for finite values and for zero).

On the other hand, I think ignoring zero values should be the default behavior for sparse arrays (although, I definitely agree there should be the option to plot all assigned values).

Would it be possible to make the change you suggest and also change the default precision value to 0? (see diff below) This change would also allow you to remove a lot of the special handling for precision=None, since precision=0 gives the same result (I didn't go this far in the diff below).
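
A quick check of the claim that `precision=0` reproduces the `!= 0` test, using only NumPy. The sample values are arbitrary; the NaN case at the end is an edge case added here for illustration, not something raised in the thread.

```python
import numpy as np

# Finite values, including a tiny magnitude and a negative zero.
data = np.array([1.0, 0.0, -2.5, 1e-300, -0.0])
by_inequality = data != 0.
by_precision = np.absolute(data) > 0.
same_for_finite = np.array_equal(by_inequality, by_precision)

# NaN compares unequal to zero but is not greater than zero, so the two
# masks can disagree if NaNs are stored in the matrix.
nan = np.array([np.nan])
nan_by_inequality = nan != 0.
nan_by_precision = np.absolute(nan) > 0.
```

For finite data the two masks agree, so the special case for `None` is redundant there; whether stored NaNs should be drawn is a separate decision.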

I also changed the behavior so that if a sparse array is input, with no marker specifications, it simply makes a default marker plot instead of raising an exception.

Excellent idea. That behavior is much more user-friendly.

Thanks,

-Tony

PS. Any comments on the small changes to the examples? Both changes are necessary for those examples to work on my computer (the shebang line throws an error when I run the code from my text editor).

Index: matplotlib/lib/matplotlib/axes.py

--- matplotlib/lib/matplotlib/axes.py (revision 6141)
+++ matplotlib/lib/matplotlib/axes.py (working copy)
@@ -6648,7 +6648,7 @@

          return Pxx, freqs, bins, im

-     def spy(self, Z, precision=None, marker=None, markersize=None,
+     def spy(self, Z, precision=0., marker=None, markersize=None,
              aspect='equal', **kwargs):
          """
          call signature::
@@ -6731,14 +6731,11 @@
          else:
              if hasattr(Z, 'tocoo'):
                  c = Z.tocoo()
-                 if precision == 0:
+                 if precision is None:
                      y = c.row
                      x = c.col
                  else:
-                     if precision is None:
-                         nonzero = c.data != 0.
-                     else:
-                         nonzero = np.absolute(c.data) > precision
+                     nonzero = np.absolute(c.data) > precision
                      y = c.row[nonzero]
                      x = c.col[nonzero]
                      y = c.row[nonzero]
                      x = c.col[nonzero]
              else:
Index: matplotlib/examples/pylab_examples/masked_demo.py

--- matplotlib/examples/pylab_examples/masked_demo.py (revision 6141)
+++ matplotlib/examples/pylab_examples/masked_demo.py (working copy)
@@ -1,4 +1,4 @@
-#!/bin/env python
+#!/usr/bin/env python
  '''
  Plot lines with points masked out.

Index: matplotlib/examples/misc/rec_groupby_demo.py

--- matplotlib/examples/misc/rec_groupby_demo.py (revision 6141)
+++ matplotlib/examples/misc/rec_groupby_demo.py (working copy)
@@ -2,7 +2,7 @@
  import matplotlib.mlab as mlab

-r = mlab.csv2rec('data/aapl.csv')
+r = mlab.csv2rec('../data/aapl.csv')
  r.sort()

  def daily_return(prices):

Tony S Yu wrote:

Would it be possible to make the change you suggest and also change the default precision value to 0? (see diff below) This change would also allow you to remove a lot of the special handling for precision=None, since precision=0 gives the same result (I didn't go this far in the diff below).

Good point. I made that change, but then made precision='present' be the value for sparse arrays to show all filled cells. precision=None is deprecated, but converted to 0.
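
A usage sketch of the behavior described above, assuming a reasonably recent Matplotlib (where `spy` documents `precision='present'` for sparse input) and SciPy. The matrix is built in COO form so the explicit zeros are certain to be stored; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so no display is required
import matplotlib.pyplot as plt
import numpy as np
import scipy.sparse as sparse

# Diagonal with three explicitly stored zeros in the middle.
diag = np.array([1, 1, 1, 0, 0, 0, 1, 1, 1], dtype=float)
idx = np.arange(9)
sp = sparse.coo_matrix((diag, (idx, idx)), shape=(9, 9))

fig, (ax0, ax1) = plt.subplots(1, 2)
# Default precision: only nonzero values get a marker.
nonzero_line = ax0.spy(sp, marker='.')
# 'present': every stored cell gets a marker, even if its value is zero.
present_line = ax1.spy(sp, precision='present', marker='.')

n_nonzero = len(nonzero_line.get_xdata())
n_present = len(present_line.get_xdata())
```

Comparing `n_nonzero` with `n_present` shows how many stored cells hold zeros, which is the "how much of the array is filled in" view Eric wanted to keep available.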

Eric

···

On Sep 27, 2008, at 8:56 PM, Eric Firing wrote: