This is one of the big differences between python and matlab: in matlab, if an m-file has changed within a session, the change is immediately effective. The python "import" statement is very different.

Gotcha, thanks.

So, while I'm being a bother:

In Matlab, I often organize data in structures as:

adcp.time [1xN]
adcp.z [Mx1]
adcp.u [MxN]

where time is the x-axis, z the z-axis and u an array of values at each depth and time (an example chosen after Eric's heart).

What is the recommended way to represent this in python? I see the info about numpy structured arrays. Is that it? It also seems that Mx1 arrays are hard in python. It also seems you need to preallocate the whole array, which isn't very flexible compared to how you can do it in Matlab. Am I missing something?

Jody,

A structured array is probably overkill; it would require storing everything as MxN, which may not be necessary.

Most of the time, if you have something that is 1-D, you can just keep it in a 1-D array. If you need adcp.time to behave as if it were MxN, you can just use it as-is, because numpy broadcasting will add dimensions to the left as needed. If you need adcp.z to behave as if it were MxN, you can simply index it like this: adcp.z[:, np.newaxis].
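As a rough illustration of the broadcasting described above, here is a minimal sketch. The sizes M and N and the array contents are made up for the example; only the shapes matter.

```python
import numpy as np

M, N = 3, 4                       # hypothetical sizes: M depths, N times
time = np.arange(N, dtype=float)  # shape (N,)
z = np.linspace(0.0, 20.0, M)     # shape (M,)
u = np.zeros((M, N))              # shape (M, N)

# A 1-D array of length N broadcasts against (M, N) as-is:
# numpy treats it as if it had shape (1, N).
u_plus_time = u + time

# For z, add a trailing axis explicitly so it has shape (M, 1).
u_plus_z = u + z[:, np.newaxis]

print(u_plus_time.shape, u_plus_z.shape)
```

Both results have shape (M, N), and neither time nor z was ever stored as a full 2-D array.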

Now, for the structure syntax, you can use a class, e.g.

    class Data:
        pass

    adcp = Data()
    adcp.time = time
    adcp.z = z
    adcp.u = u

Now your adcp instance is just like the matlab structure.

This works, but you might want to use a more flexible container. One variation on the Bunch, a dictionary whose entries can also be accessed as attributes, is here:

http://currents.soest.hawaii.edu/hgstage/pycurrents/file/8bf05a53b326/system/misc.py

It is fancier than you need for now, but illustrates the sort of thing you can do with python, and it will work fine even when you don't need all its features. You could initialize it like this:

adcp = Bunch(time=time, z=z, u=u)

assuming, as before, that you already have individual numpy arrays called time, z, and u. You can still tack on additional attributes, like

adcp.something_else = whatever

The Bunch allows access using the structure notation, and also using dictionary syntax, so adcp.u is the same as adcp['u']. The dictionary syntax is particularly useful when automating operations, because you can easily iterate over a list of dictionary entries.
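To make the idea concrete, here is a stripped-down sketch of a Bunch-like class. It is not the pycurrents implementation linked above, just a hypothetical minimal version showing how attribute and dictionary access can share the same storage. Plain lists stand in for the numpy arrays.

```python
class Bunch(dict):
    """Minimal dict with attribute access (illustrative sketch only)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attributes as dictionary entries.
        self[name] = value


# Placeholder data; in practice these would be numpy arrays.
adcp = Bunch(time=[0.0, 1.0], z=[5.0, 10.0, 15.0])
adcp.u = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Dictionary syntax makes it easy to loop over fields by name.
for name in ["time", "z", "u"]:
    print(name, len(adcp[name]))
```

Because the class subclasses dict, the usual dictionary methods (keys, items, update) work without extra code.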

Regarding the need to pre-allocate: yes, matlab is slicker in this regard, and every now and then there is discussion about implementing equivalent behavior in numpy, or in an add-on module.

In many cases you can simply accumulate values in a list, and then at the end use an array constructor to make an ndarray from the list.
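A minimal sketch of that list-then-array pattern follows. The loop and the profile values are invented for illustration; in real use each profile would come from a file or instrument.

```python
import numpy as np

profiles = []                        # grows one entry per time step
for i in range(5):                   # hypothetical loop over incoming profiles
    profile = np.full(3, float(i))   # stand-in for one 3-point velocity profile
    profiles.append(profile)

# Each list entry becomes a row, giving shape (N, M); transpose to get (M, N).
u = np.array(profiles).T
print(u.shape)
```

Appending to a list is cheap, so the only full copy of the data happens once, in the final array constructor.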

You can also use the numpy concatenate function, or its relatives such as vstack and hstack, but this usually makes sense only for gluing together small numbers of arrays, because every call copies all of the data into a new array.
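For the small-number-of-arrays case, a sketch along these lines may help. The two chunks and their sizes are hypothetical.

```python
import numpy as np

u_first = np.zeros((3, 4))       # hypothetical chunk: 3 depths, 4 times
u_second = np.ones((3, 2))       # later chunk: same depths, 2 more times
t_first = np.arange(4.0)
t_second = np.arange(4.0, 6.0)

# Join along the time axis (axis 1 for u, the only axis for time).
u = np.concatenate([u_first, u_second], axis=1)
time = np.concatenate([t_first, t_second])

# np.hstack([u_first, u_second]) gives the same result for these 2-D arrays.
print(u.shape, time.shape)
```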

Eric


On 2012/09/08 5:34 PM, Jody Klymak wrote:

Thanks, Jody