As any of you subscribed to the numpy-discussion list have
probably noticed, there's intense debate going on about how numpy can
do a better job of handling missing data and masked arrays. Part of
the problem is that we aren't actually sure what users need these
features to do. There's one group who just wants R-style "missing
data", and their needs are pretty straightforward -- they just want a
magic value that indicates some data point doesn't actually exist. But
it seems like there's also demand for a more "masked array"-like
feature, similar to the current numpy.ma, where the mask is
non-destructive and easily manipulable. No-one seems clear on how
exactly this should work, though, and there's a lot of disagreement
about what semantics make sense. (If you want more details, there's a
wiki page summarizing some of this.)
Since you seem to be the biggest users of numpy.ma, it would be really
helpful if you could explain how you actually use it, so we can make
sure that whatever we do in numpy-land is actually useful to you!
What does matplotlib use masked arrays for? Is it just a convenient
way to keep an array and a boolean mask together in one object, or do
you take advantage of more numpy.ma features? For example, do you
- unmask values?
- create multiple arrays that share the same storage for their data,
but have different masks? (i.e., creating a new array with new
elements masked, but without actually allocating the memory for a full
copy?)
- use reduction operations on masked arrays? (e.g., np.sum(masked_arr))
- use binary operations on masked arrays? (e.g., masked_arr1 + masked_arr2)
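
To make the questions above concrete, here is a rough sketch of what each
of these operations looks like with the current numpy.ma. The variable
names and values are made up for illustration, and this is only my
reading of how numpy.ma behaves today, not a claim about how matplotlib
uses it.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])

# An array and a boolean mask kept together in one object.
# copy=False asks numpy.ma to reuse `data` instead of copying it.
a = np.ma.array(data, mask=[False, True, False, False], copy=False)

# A second masked array over the same storage, with a different mask.
b = np.ma.array(data, mask=[False, False, False, True], copy=False)
shares_storage = np.shares_memory(a.data, b.data)

# Reduction: masked elements are skipped, so this adds 1.0 + 3.0 + 4.0.
total = np.sum(a)

# Binary operation: an element of the result is masked if it is masked
# in either operand.
c = a + b

# Unmasking: clear the mask on a copy. The masking was non-destructive,
# so the value that was hidden (2.0) is visible again.
u = a.copy()
u.mask = np.ma.nomask
```

If your usage looks quite different from this (for example, if you rely
on hard masks or fill values), that is exactly the kind of detail that
would be useful to hear about.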
And while we're at it, any complaints about how numpy.ma works now,
that a new version might do better?
Thanks in advance,