Andrew Jaffe wrote:
Hi All,
I've encountered a strange problem: I've been running some python code
on both a linux box and OS X, both with python 2.4.1 and the latest
numpy and matplotlib from svn.
I have found that when I transfer pickled numpy arrays from one machine
to the other (in either direction), the resulting data *looks* all right
(i.e., it is a numpy array of the correct type with the correct values
at the correct indices), but it seems to produce the wrong result in (at
least) one circumstance: matplotlib.hist() gives the completely wrong
picture (and set of bins).
This can be ameliorated by running the array through
arr=numpy.asarray(arr, dtype=numpy.float64)
but this seems like a complete kludge (and is only needed when you do
the transfer between machines).
You have a byteorder issue. Your Linux box, which I presume has an Intel or AMD
CPU, is little-endian, whereas your OS X box, which I presume has a PPC CPU, is
big-endian. numpy arrays can store their data in either endianness on either
kind of platform; their dtype objects tell you which byteorder they are using.
In the dtype specifications below, '>' means big-endian (I am using a PPC
PowerBook), and '<' means little-endian.
In [31]: a = linspace(0, 10, 11)
In [32]: a
Out[32]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
In [33]: a.dtype
Out[33]: dtype('>f8')
In [34]: b = a.newbyteorder()
In [35]: b
Out[35]:
array([ 0.00000000e+000, 3.03865194e-319, 3.16202013e-322,
        1.04346664e-320, 2.05531309e-320, 2.56123631e-320,
        3.06715953e-320, 3.57308275e-320, 4.07900597e-320,
        4.33196758e-320, 4.58492919e-320])
In [36]: b.dtype
Out[36]: dtype('<f8')
In [41]: a.tostring()[-8:]
Out[41]: '@$\x00\x00\x00\x00\x00\x00'
In [42]: b.tostring()[-8:]
Out[42]: '@$\x00\x00\x00\x00\x00\x00'
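The distinction the session above is getting at (relabelling the byteorder of the same bytes versus converting the values to the other byteorder) can be sketched as a standalone script. This is only an illustration under some assumptions: it uses `dtype.newbyteorder("S")`, `ndarray.view()` and `tobytes()` rather than the `a.newbyteorder()` and `tostring()` calls shown above, since newer numpy releases, as far as I know, no longer provide those array methods.

```python
import numpy as np

a = np.linspace(0, 10, 11)            # native byteorder on whatever machine runs this
swapped = a.dtype.newbyteorder("S")   # the same float64 type, opposite byteorder

# Reinterpret: identical bytes in memory, read back with the wrong byteorder.
b = a.view(swapped)

# Convert: the bytes are actually swapped, so the values stay the same.
c = a.astype(swapped)

print(a.tobytes() == b.tobytes())   # same raw bytes as the original
print(np.array_equal(a, b))         # but the values no longer match
print(np.array_equal(a, c))         # conversion keeps the values
print(a.tobytes() == c.tobytes())   # and changes the raw bytes
```

The view corresponds to `b` in the session (garbage values, same trailing bytes), and the `astype` result corresponds to an array that was written on the other kind of machine.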
Apparently, the pickle stores the data in the creator machine's byteorder and
marks it as such. When the reading machine loads the pickle, it recognizes from
the dtype that the byteorder is opposite its native byteorder.
Most operations work as you might expect:
In [44]: a.astype(dtype('<f8'))
Out[44]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
In [45]: c = _
In [46]: c.dtype
Out[46]: dtype('<f8')
In [47]: a + c
Out[47]: array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.])
Some don't:
In [54]: c.sort()
In [55]: c
Out[55]: array([ 0., 2., 3., 4., 5., 6., 7., 8., 9., 10., 1.])
This is a bug.
http://projects.scipy.org/scipy/numpy/ticket/47
--
Robert Kern
robert.kern@...287...
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco