Dear all matplotlib users,
Happy New Year.
I am trying to check the distribution of a 2D array, and I find that the histogram plotting function doesn’t seem to respect NumPy masked arrays.
In [188]: a=list(range(1,6)); b=np.array(a+a[::-1])
In [189]: b=np.ma.masked_equal(b,2); b=np.ma.masked_equal(b,5)
In [190]: b
Out[190]:
masked_array(data = [1 -- 3 4 -- -- 4 3 -- 1],
mask = [False True False False True True False False True False],
fill_value = 5)
In [191]: n,bins,patches=plt.hist(b)
In [192]: n
Out[192]: array([2, 0, 2, 0, 0, 2, 0, 2, 0, 2])
In [193]: n.sum()
Out[193]: 10
It seems that all the elements (masked or not) are counted in the histogram,
and that the original values are used rather than the fill_value?
I attach a figure below.
In [194]: plt.show()
--
Chao YUE
Laboratoire des Sciences du Climat et de l’Environnement (LSCE-IPSL)
UMR 1572 CEA-CNRS-UVSQ
Batiment 712 - Pe 119
91191 GIF Sur YVETTE Cedex
Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
Yes, this is a known issue (at least judging from the comments within the function). It looks like hist() uses np.asarray() instead of np.asanyarray(), which strips the mask from the array. However, I don’t think the fix is as straightforward as changing that call to np.asanyarray(). I will take a peek and see what can be done.
Ben Root
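
To illustrate the point about np.asarray() versus np.asanyarray(), here is a minimal sketch rather than the actual code inside hist(). It rebuilds the array from the question, shows that np.asarray() returns a plain ndarray holding the original values, and then uses MaskedArray.compressed() as one possible workaround. The variable names (plain, kept, valid) are illustrative only, and np.histogram stands in for the binning so the example runs without a plotting backend.

```python
import numpy as np

# Rebuild the masked array from the question: mask the 2s, then the 5s.
a = list(range(1, 6))
b = np.ma.masked_equal(np.array(a + a[::-1]), 2)
b = np.ma.masked_equal(b, 5)

# np.asarray() returns a base ndarray, so the mask is lost and the
# original (unfilled) values come back.
plain = np.asarray(b)

# np.asanyarray() passes array subclasses through, so the mask survives.
kept = np.asanyarray(b)

# Possible workaround: keep only the unmasked values (always 1-D,
# even when the input is 2D) before computing the histogram.
valid = b.compressed()
counts, edges = np.histogram(valid, bins=10)

print(type(plain).__name__, plain)
print(type(kept).__name__, kept)
print(valid, counts.sum())
```

Under these assumptions, passing b.compressed() to plt.hist() in place of b should count only the six unmasked values instead of all ten.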
On Mon, Jan 2, 2012 at 11:10 AM, Chao YUE <chaoyuejoy@…1896…> wrote:
> [...]
Thanks Ben.
cheers,
Chao
2012/1/2 Benjamin Root <ben.root@…1304…>
> [...]