numpy and matplotlib usage

Hi,

I'm using matplotlib w/numerix set to numpy (as described in my prior post).

What I am wondering is in what situations one would want to:

import pylab
import numpy

together, because there is matlab-style stuff (e.g. matrices, arrays, cumprod, fft, arange etc.) by importing the pylab package alone.

On a related note, in trying to get a better idea of how math vs. plot functionality is handled in matplotlib, I looked at mlab.py and pylab.py; it doesn't seem there is a clear separation of these functions, but perhaps I'm missing something?

Thanks!

--b

Belinda,

I will give a short answer, and maybe someone else will be able to provide a more complete answer or a reference to one.

belinda thom wrote:

> Hi,
>
> I'm using matplotlib w/numerix set to numpy (as described in my prior post).
>
> What I am wondering is in what situations one would want to:
>
> import pylab
> import numpy

One reason for doing this would be to make it clear where your numeric components are really coming from, and to take full advantage of numpy.
An idiom that I often use is

import pylab as P
import numpy as N

and then anything that ultimately comes from numpy anyway, like arange, I invoke as N.arange rather than P.arange. Or you can do it all at the top of the script with "from pylab import ..." and "from numpy import ...".
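
As a rough illustration of this idiom (a sketch only: the Agg backend and the temporary output file are assumptions made so that it runs without a display, and they are not part of the advice above), the array work goes through N and only the plotting goes through P:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import numpy as N
import pylab as P

# Numerics come from numpy explicitly...
x = N.arange(0.0, 2.0, 0.25)
y = N.cumprod(N.ones_like(x) * 1.1)

# ...and only the plotting calls go through pylab.
lines = P.plot(x, y)
P.xlabel("x")

# Hypothetical output location, used only for this sketch.
out_path = os.path.join(tempfile.mkdtemp(), "idiom.png")
P.savefig(out_path)
```

Reading the script later, every call states which package it belongs to, which is the point of the prefixes.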

> together, because there is matlab-style stuff (e.g. matrices, arrays, cumprod, fft, arange etc.) by importing the pylab package alone.

Pylab is a convenience package that aggregates plotting and numerical functionality in a single namespace while hiding the differences among numeric packages. This support of alternative numeric packages has been an important part of matplotlib, but its utility will diminish and it will be phased out now that numpy is ready to use. An advantage of using numpy directly now, rather than via the parts of it that pylab imports, is that you will be getting accustomed to the new numpy style rather than the compatibility mode.

> On a related note, in trying to get a better idea of how math vs. plot functionality is handled in matplotlib, I looked at mlab.py and pylab.py; it doesn't seem there is a clear separation of these functions, but perhaps I'm missing something?

You are correct. Pylab imports everything from matplotlib.numerix.mlab. Confusingly, there is both a matplotlib.mlab and a matplotlib.numerix.mlab, and the latter uses an mlab from the selected numerical package. The numpy version of mlab, in turn, imports lots of stuff from numpy--it is another aggregator. It seems that numpy's mlab is trying to give a Matlab-like environment without the plotting, and pylab is trying to do the same thing but with the plotting included.

The net results of the attempts of numpy and pylab to make things convenient by aggregating functionality in a single namespace are (1) it really can be convenient--a one-stop shop, and (2) it can be very confusing with all the possible places things can get imported from, some of them with similar names.

IPython is enormously helpful in showing where things really come from. For example, in an "ipython -pylab" session, where "from pylab import *" has been done automatically by ipython, we get:

In [1]:arange?
Type: function
Base Class: <type 'function'>
String Form: <function arange at 0xb54f8ca4>
Namespace: Interactive
File: /usr/local/lib/python2.4/site-packages/numpy/oldnumeric/functions.py
Definition: arange(start, stop=None, step=1, typecode=None, dtype=None)
Docstring:
     <no docstring>

Note that pylab is pulling in numpy's backwards-compatibility version of arange rather than its native version.

Here is another example of different versions of a function with the same name, first the pylab version, then the numpy version:

In [3]:linspace?
Type: function
Base Class: <type 'function'>
String Form: <function linspace at 0xb53f59cc>
Namespace: Interactive
File: /usr/local/lib/python2.4/site-packages/matplotlib/mlab.py
Definition: linspace(xmin, xmax, N)
Docstring:
     <no docstring>

In [4]:import numpy

In [5]:numpy.linspace?
Type: function
Base Class: <type 'function'>
String Form: <function linspace at 0xb5570df4>
Namespace: Interactive
File: /usr/local/lib/python2.4/site-packages/numpy/lib/function_base.py
Definition: numpy.linspace(start, stop, num=50, endpoint=True, retstep=False)
Docstring:
     Return evenly spaced numbers.

     Return num evenly spaced samples from start to stop. If
     endpoint is True, the last sample is stop. If retstep is
     True then return the step value used.
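
Outside of an ipython session, a similar check can be sketched with the standard inspect module. The helper name where_from is hypothetical, and what it reports depends on the installed versions, so no particular output is claimed here:

```python
import inspect
import math

import numpy


def where_from(obj):
    """Return (module name, source file or None) for a function-like object."""
    module_name = getattr(obj, "__module__", None)
    try:
        source_file = inspect.getsourcefile(obj)
    except (TypeError, OSError):
        # Built-in or C-implemented objects have no Python source file.
        source_file = None
    return module_name, source_file


for func in (numpy.arange, numpy.linspace, math.sqrt):
    name = getattr(func, "__name__", repr(func))
    print(name, where_from(func))
```

Applying it to the same name taken from two namespaces (for example a star-imported linspace and numpy.linspace) shows whether they are really the same function.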

Depending on what you are doing and on the style you prefer, you may want to use as much as possible from pylab, or you may want to use only a few things from the pylab namespace and then explicitly use the matplotlib and numpy namespaces (and object methods) for everything else. A bit of this is discussed in the matplotlib examples/pythonic_matplotlib.py.

I am inclined to directly invoke numpy instead of going through pylab and numerix.

That was longer than I intended for this message... but the complexity of A importing things from B and B from A via B.C, etc. is making my head spin. I hope that in the process of switching matplotlib to numpy-only we will be able to simplify all this.

Eric

Hi Belinda,

> Hi,
>
> I'm using matplotlib w/numerix set to numpy (as described in my prior
> post).
>
> What I am wondering is in what situations one would want to:
>
> import pylab
> import numpy
>
> together, because there is matlab-style stuff (e.g. matrices, arrays,
> cumprod, fft, arange etc.) by importing the pylab package alone.

To add to Eric's detailed reply, keep in mind that much of this
duplication is a historical accident. John Hunter developed mpl (and
hence pylab) back in the Dark Days of the Split (aka, when we lived
with Numeric and Numarray, both lacking critical functionality). At
that time he needed various pieces of numerical functionality for his
own work, so the most logical thing to do was to put it in the package
he had control over: matplotlib. In fact, the same thing happened in
three places: if you look at the python landscape for these tools
around 2003/4, you'll find that ipython, scipy and matplotlib ALL had
tools for: interactive work, plotting and numerics. Over time, as
each package has matured, we've all tried to move away from this, so
that hopefully the responsibilities will be:

- ipython -> interactive work
- numpy/scipy -> numerics
- matplotlib -> plotting

While little code has been removed yet (to avoid breaking
compatibility for existing users), at least most of what mattered has
been moved to where it makes sense: numpy inherited utilities from
ipython and pylab, ipython has absorbed the interactive support for
matplotlib and I don't develop its plotting tools anymore (they were
for gnuplot), etc.

Following these ideas, in my personal use I normally do:

import numpy as N
import scipy as S
import pylab as P

and I try to use P.plottingStuff(), N.arrayStuff() and
S.scipyOnlyThings(). I think this is an approach that better matches
the real intent of these tools for the long term.

I hope this is useful.

best,

f

···

On 12/29/06, belinda thom <bthom@...1382...> wrote:

Fernando and Eric have offered very nice explanations, but I have one thing to add:

Fernando Perez wrote:

> hopefully the responsibilities will be:
>
> - ipython -> interactive work
> - numpy/scipy -> numerics
> - matplotlib -> plotting

I sure hope so too.

> Following these ideas, in my personal use I normally do:
>
> import numpy as N
> import scipy as S
> import pylab as P

I do something very similar. I really believe in namespaces; "import *" is a bad idea.

However, I also try to avoid pylab altogether, in favor of:

import matplotlib as MPL

MPL.PlottingStuff()

matplotlib provides the plotting functionality in a nice OO way. pylab is essentially a wrapper that provides a Matlab-like procedural interface to matplotlib. For me, one of the reasons I use python rather than Matlab is that it is a richer, more full-featured language, including OO. Some folks think the procedural approach is better for interactive use, but I'm not so sure, and I'm quite sure that the OO approach is better for "real programs".

At this point, it isn't quite possible to use just matplotlib for interactive use, as pylab has the functionality to manage the figure windows, etc, so I do use a tiny bit of pylab there, but try to keep to the OO interface otherwise.
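
For non-interactive use, the OO style can be sketched without importing pylab at all. This is only an illustration under assumptions that are not from the message above: it uses the Agg canvas and writes the PNG to an in-memory buffer, so no figure window is involved.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure
import numpy as N

x = N.arange(0.0, 1.0, 0.1)

# Build the figure and axes as objects instead of relying on pylab's
# implicit "current figure".
fig = Figure(figsize=(4, 3))
canvas = FigureCanvasAgg(fig)  # attach a non-GUI canvas to the figure
ax = fig.add_subplot(111)
(line,) = ax.plot(x, x ** 2)
ax.set_xlabel("x")
ax.set_title("OO interface sketch")

# Render to an in-memory PNG rather than a window or a named file.
buf = io.BytesIO()
fig.savefig(buf, format="png")
png_bytes = buf.getvalue()
```

Everything is a method call on fig or ax, so several figures can be handled side by side without any hidden state.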

The key stumbling block is docs -- most of the docs and tutorials use the pylab interface, so it's a bit harder to figure out what to do. This should help get you started:

http://sourceforge.net/mailarchive/message.php?msg_id=11033442

(Did that ever get up on the Wiki?, and/or does anyone have other pointers?)

-Chris

···

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...259...

> Fernando Perez wrote:
>
>> hopefully the responsibilities will be:
>>
>> - ipython -> interactive work
>> - numpy/scipy -> numerics
>> - matplotlib -> plotting
>
> I sure hope so too.
>
>> Following these ideas, in my personal use I normally do:
>>
>> import numpy as N
>> import scipy as S
>> import pylab as P
>
> At this point, it isn't quite possible to use just matplotlib for
> interactive use, as pylab has the functionality to manage the figure
> windows, etc, so I do use a tiny bit of pylab there, but try to keep to
> the OO interface otherwise.

A tangential question: recently I was looking for a way to save/load numeric data (often so it could be used later for building plots). I found load/save better documented than numpy's to/fromfile, so I used that. The question, from the responsibility point of view, however, is where I should be going to get this functionality. Are both equally "stable"? Also, since numpy borrows from matlab, I was surprised that load/save is only provided via matplotlib's mlab.py (it's not in numpy's matlib.py).

> The key stumbling block is docs -- most of the docs and tutorials use
> the pylab interface, so it's a bit harder to figure out what to do. This
> should help get you started:

Agreed.

> http://sourceforge.net/mailarchive/message.php?msg_id=11033442

I never found that on the wiki, and have spent some time looking through it.

> (Did that ever get up on the Wiki?, and/or does anyone have other pointers?)

As an experienced Matlab programmer, I have found

http://37mm.no/mpy/matlab-numpy.html

quite useful, but it doesn't emphasize the OO aspect of matplotlib.

···

On Jan 3, 2007, at 10:01 AM, Christopher Barker wrote:

-Chris

belinda thom wrote:

> A tangential question; recently I was looking for a way to save/load
> numeric data (often so it could be used later for building plots). I
> found load/save better documented than numpy's to/fromfile, so used
> that. The question, from the responsibility point of view, however,
> is where I should be going to get this functionality. Are both
> equally "stable"? Also, since numpy borrows from matlab, I was
> surprised that load/save is only provided via matplotlib's mlab.py
> (it's not in numpy's matlib.py).

Maybe you know that already, but in scipy there is something like
scipy.io.read_array and write_array which is very similar to mpl's
load/save (IIRC).

IMHO something like that would be a welcome addition to numpy, but I
have learned that adding features to numpy is quite controversial...
(oh, we're on the mpl list right now, ok)

In the end I wrote my own csv read and write functions (see post in
other thread), because I didn't see why my code should depend on having
scipy or mpl installed just because of 20 or 30 lines of code.
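
For readers wondering what such a pair of functions might look like, here is a minimal sketch. It is not the code from the other thread: the names write_csv and read_csv are hypothetical, it assumes a purely numeric 2-D array, and it relies only on numpy and the standard csv module.

```python
import csv
import os
import tempfile

import numpy


def write_csv(path, array):
    """Write a 2-D array to a CSV file, one array row per line."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(array.tolist())


def read_csv(path):
    """Read a CSV file of numbers back into a 2-D float array."""
    with open(path, newline="") as handle:
        rows = [[float(cell) for cell in row]
                for row in csv.reader(handle) if row]
    return numpy.array(rows)


data = numpy.arange(6, dtype=float).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), "data.csv")  # hypothetical file name
write_csv(path, data)
restored = read_csv(path)
```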

(AFAIK numpy fromfile and tofile are for binary data, not text files.
Don't know if you want that.)

So if you never use numpy standalone, I'd say go for mpl's load/save.

cheers,
sven

You can use numpy's tofile and fromfile for text files; you just have to pass a
separator through the sep keyword argument. However, fromfile only returns 1D arrays.
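
A small sketch of that round trip, assuming a comma separator and a temporary file (both chosen only for illustration). The shape has to be restored by hand because only the flat sequence of numbers is stored:

```python
import os
import tempfile

import numpy

a = numpy.arange(6, dtype=float).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), "flat.txt")  # hypothetical file name

# A non-empty sep switches tofile/fromfile from binary to text mode.
a.tofile(path, sep=",")
flat = numpy.fromfile(path, dtype=float, sep=",")

# fromfile returns a 1D array, so reshape explicitly.
b = flat.reshape(a.shape)
```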

···

On Thursday 04 January 2007 06:13, Sven Schreiber wrote:

Sven Schreiber wrote:

> belinda thom wrote:
>
>> Also, since numpy borrows from matlab,

Not really -- pylab is specifically designed to be similar to matlab, numpy is not -- and the matlib is left over from Numeric, and I don't think it was all that well maintained there, either.

> Maybe you know that already, but in scipy there is something like
> scipy.io.read_array and write_array which is very similar to mpl's
> load/save (IIRC).
>
> IMHO something like that would be a welcome addition to numpy, but I
> have learned that adding features to numpy is quite controversial...
> (oh, we're on the mpl list right now, ok)

It's an obvious thing to have in a comprehensive package -- maybe not so much in numpy. numpy does have the core tools to build that sort of thing, however.

> In the end I wrote my own csv read and write functions (see post in
> other thread), because I didn't see why my code should depend on having
> scipy or mpl installed just because of 20 or 30 lines of code.

Again, that is another example of why SciPy should be better modularized -- what if it just depended on having scipy.io installed?

> (AFAIK numpy fromfile and tofile are for binary data, not text files.
> Don't know if you want that.)

As mentioned, fromfile and tofile do offer basic text file support, but they do not preserve the shape of the array. What they do provide, however, is very good performance for reading/writing lots of numbers to/from text files -- they are great tools for building your own text file parser/generator. In fact, your csv reader/writer and much of scipy.io could probably benefit from using them (I don't think scipy.io does now, as the text file support is new to numpy).

I don't know if someone has written it yet, but a load/save pair that put a small header with type and shape info in a file, then dumped the contents with tofile(...sep=something) could be pretty handy, and fast.
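
One way such a pair could look is sketched below. The function names save_text and load_text and the one-line header layout (dtype string followed by the dimensions) are assumptions made up for this illustration, not an existing numpy or matplotlib format.

```python
import os
import tempfile

import numpy


def save_text(path, array, sep=" "):
    """Write a one-line header (dtype and shape), then the flat data as text."""
    array = numpy.asarray(array)
    header = " ".join([array.dtype.str] + [str(n) for n in array.shape])
    with open(path, "wb") as handle:
        handle.write(header.encode("ascii") + b"\n")
        array.tofile(handle, sep=sep)


def load_text(path, sep=" "):
    """Read the header back, then the data, and restore the original shape."""
    with open(path, "rb") as handle:
        fields = handle.readline().decode("ascii").split()
        dtype = numpy.dtype(fields[0])
        shape = tuple(int(n) for n in fields[1:])
        flat = numpy.fromfile(handle, dtype=dtype, sep=sep)
    return flat.reshape(shape)


a = numpy.arange(24, dtype=float).reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "array.txt")  # hypothetical file name
save_text(path, a)
b = load_text(path)
```

The header costs one line and lets the reader rebuild the shape and dtype that a bare tofile dump would lose.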

> So if you never use numpy standalone, I'd say go for mpl's load/save.

What about pickle? I'm pretty sure pickle works, doesn't require any additional packages, and preserves the format of the array:

>>> import numpy as N
>>> import cPickle
>>> a = N.arange(24, dtype=N.float)
>>> a.shape = (2,3,4)
>>> a
array([[[ 0., 1., 2., 3.],
         [ 4., 5., 6., 7.],
         [ 8., 9., 10., 11.]],

        [[ 12., 13., 14., 15.],
         [ 16., 17., 18., 19.],
         [ 20., 21., 22., 23.]]])

>>> cPickle.dump(a, file("test.pickle",'wb'))

>>> b = cPickle.load(file("test.pickle",'rb'))

>>> b
array([[[ 0., 1., 2., 3.],
         [ 4., 5., 6., 7.],
         [ 8., 9., 10., 11.]],

        [[ 12., 13., 14., 15.],
         [ 16., 17., 18., 19.],
         [ 20., 21., 22., 23.]]])
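
The session above is Python 2 (cPickle and file()), and N.float is no longer available in recent numpy releases. A sketch of the same round trip for Python 3, assuming the standard pickle module and a temporary file path chosen for illustration:

```python
import os
import pickle
import tempfile

import numpy as N

a = N.arange(24, dtype=float).reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "test.pickle")  # hypothetical location

# Context managers make sure the file is closed after each step.
with open(path, "wb") as handle:
    pickle.dump(a, handle)

with open(path, "rb") as handle:
    b = pickle.load(handle)
```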

This is one of those cases where "There should be one obvious way to do it" is failing :(

-Chris
