map projections

OK, made some headway here.

When profiling for performace, I often turn off the GUI so my numbers
aren't influenced by the user interface

> /usr/local/lib/python/profile.py plotmap.py -dAgg > prof.out

Here are the profile results sorted by cumulative time for your script
- only the most expensive functions are shown

   7.590 proj.py:43(__call__)
   7.080 os.py:611(popen2)
   7.050 popen2.py:31(__init__)
   7.050 popen2.py:143(popen2)
   2.610 matlab.py:1305(savefig)

On my system, over half the time is spent running your popen process.
Can't help you with this one :-). To get better numbers and focus on
matplotlib, I cached the xy points that the popen process is
generating to a pickled list.

You're using pcolor to generate the colormap. pcolor uses
collections, which is already pretty fast (agg implements collection
drawing in extension code, no python loops needed to make a pcolor).
However, you can get better performace still from imshow with the data
limits set. This will approximately double the performance of the
image part of the map. I had to make some changes to matplotlib (see
below) because image origin wasn't playing nicely with image extent -
this also fixes Andrew's bug.

For the line part of the map, I extended the
matplotlib.collections.LineCollection class to handle a sequence of
lines, where each line is defined by a list of tuples (x0,y0), (x1,
y1), ... Thus all of your lines are handled by a single object,
rather than having 1800+ separate line objects created in plot.
Again, no python loops required.

In the current form, the code takes about 1.15s to run on my machine
and is about 30x faster than the original code you posted which
includes the data loading part. Nonetheless, the matplotlib part is
much faster too, as you'll see when you interact with the data.

I'm including a link to a matplotlib snapshot below which includes the
required changes. As a bonus, you can try out the new navigation
toolbar (in progress, only works w/ gtk and gtkagg). It includes a
view limits stack, hand pan, and zoom to rectangle. Much nicer for
map navigation. And with the changes, you can actually interact with
your data with reasonable performance. I need to add some more
features to the toolbar, but give it a test drive and let me know if
you have suggestions. Set 'toolbar: toolbar2' in matplotlibrc

Thanks much for the fink package - I'll continue to point OS X users
to your site!

JDH

### matplotlib shapshot:
    http://nitace.bsd.uchicago.edu:8080/files/share/matplotlib-0.60.3a.tar.gz

### Here is a link to the xy point data:
    http://nitace.bsd.uchicago.edu:8080/files/share/xypts.pickled

### Here is the code snippet I used to generate it

xypts = []
for line in wcl:
    splitline = line.split()
    if splitline[0] == '#':
        segnum = segnum + 1
  if segnum > 1:
            # convert lon,lat to map coordinates x,y
            xys = zip(*proj(Numeric.array(lons),Numeric.array(lats)))
            xypts.append( xys)
pickle.dump(xypts, file('xypts.pickled', 'w') )

### Here is the modified plotmap script which uses imshow and line
### collections

import time
from matplotlib.matlab import *
from matplotlib.collections import LineCollection
from proj import Proj
import Numeric, cPickle

# set the default params for imshow
rc('image', origin='lower', cmap='jet')

ax = subplot(111)
nx = 349; ny = 277
dx = 32463.41; dy = 32463.41
xmax = (nx-1)*dx; ymax = (ny-1)*dy # size of domain to plot

datin = open('topodata.pickle','rb')
C = cPickle.load(datin)
# use imshow rather than pcolor for speed
im = ax.imshow(C, interpolation='nearest',
               extent=(0, xmax, 0, ymax))

xypts = cPickle.load(file('xypts.pickled', 'r') )
# all args are sequences, length 1 in case of linewidths and
# antialiased
collection = LineCollection(segments = xypts,
                            colors = ( (0,0,0,1), ), # black
                            linewidths = (1.5,),
                            antialiaseds = (0,), # turn off aa for speed
                            )

ax.add_collection(collection)

# you have to manually update the datalim; this is a bit ugly so I'll
# work on the interface
xs = [ x for thisline in xypts for x,y in thisline ]
ys = [ y for thisline in xypts for x,y in thisline ]
minx, maxx = min(xs), max(xs)
miny, maxy = min(ys), max(ys)
ax.update_datalim( ((minx, miny), (maxx, maxy)))

axis([0, xmax, 0, ymax])
ax.set_xticks([]) # no ticks
ax.set_yticks([])
title('40 km Topography - Lambert Conformal Conic Projection')
#savefig('test4')
show()

John:

OK, made some headway here.

When profiling for performace, I often turn off the GUI so my numbers
aren't influenced by the user interface

> /usr/local/lib/python/profile.py plotmap.py -dAgg > prof.out

Here are the profile results sorted by cumulative time for your script
- only the most expensive functions are shown

  7.590 proj.py:43(__call__)
  7.080 os.py:611(popen2)
  7.050 popen2.py:31(__init__)
  7.050 popen2.py:143(popen2)
  2.610 matlab.py:1305(savefig)

On my system, over half the time is spent running your popen process.
Can't help you with this one :-). To get better numbers and focus on
matplotlib, I cached the xy points that the popen process is
generating to a pickled list.

OK - looks like I have to work on the proj module. I probably should make it a C extension that calls the proj library directly instead of communicating with the proj application via pipes.

You're using pcolor to generate the colormap. pcolor uses
collections, which is already pretty fast (agg implements collection
drawing in extension code, no python loops needed to make a pcolor).
However, you can get better performace still from imshow with the data
limits set. This will approximately double the performance of the
image part of the map. I had to make some changes to matplotlib (see
below) because image origin wasn't playing nicely with image extent -
this also fixes Andrew's bug.

For the line part of the map, I extended the
matplotlib.collections.LineCollection class to handle a sequence of
lines, where each line is defined by a list of tuples (x0,y0), (x1,
y1), ... Thus all of your lines are handled by a single object,
rather than having 1800+ separate line objects created in plot.
Again, no python loops required.

In the current form, the code takes about 1.15s to run on my machine
and is about 30x faster than the original code you posted which
includes the data loading part. Nonetheless, the matplotlib part is
much faster too, as you'll see when you interact with the data.

I'm including a link to a matplotlib snapshot below which includes the
required changes. As a bonus, you can try out the new navigation
toolbar (in progress, only works w/ gtk and gtkagg). It includes a
view limits stack, hand pan, and zoom to rectangle. Much nicer for
map navigation. And with the changes, you can actually interact with
your data with reasonable performance. I need to add some more
features to the toolbar, but give it a test drive and let me know if
you have suggestions. Set 'toolbar: toolbar2' in matplotlibrc

Excellent! I'll do some more work on it and repost a (hopefully) faster version.

Thanks for the quick response.

-Jeff

···

On Tue, 13 Jul 2004, John Hunter wrote:

--
Jeffrey S. Whitaker Phone : (303)497-6313
Meteorologist FAX : (303)497-6449
NOAA/OAR/CDC R/CDC1 Email : Jeffrey.S.Whitaker@...259...
325 Broadway Web : www.cdc.noaa.gov/~jsw
Boulder, CO, USA 80303-3328 Office : Skaggs Research Cntr 1D-124

OK, made some headway here.

....

For the line part of the map, I extended the
matplotlib.collections.LineCollection class to handle a sequence of
lines, where each line is defined by a list of tuples (x0,y0), (x1,
y1), ... Thus all of your lines are handled by a single object,
rather than having 1800+ separate line objects created in plot.
Again, no python loops required.

In the current form, the code takes about 1.15s to run on my machine
and is about 30x faster than the original code you posted which
includes the data loading part. Nonetheless, the matplotlib part is
much faster too, as you'll see when you interact with the data.

John: I found that if I just call proj with all the lats and lons at once (instead of once for each segment) I can speed it up tremendously. Here's what I tried, using the new LineCollection snippets you sent me, and the updated matplotlib snapshot:

from matplotlib.matlab import *
from matplotlib.collections import LineCollection from proj import Proj
import Numeric, time

# open file with coastline data (from world coastline database).
wcl = open('wcl.txt')

# set up map projection parameters (lambert conformal conic,
# standard parallels at 50 deg N, center longitued 107 deg W.
params = {}
params['proj'] = 'lcc'
params['R'] = 63712000
params['lat_1'] = 50
params['lat_2'] = 50
params['lon_0'] = -107
proj = Proj(params)
llcornerx, llcornery = proj(-145.5,1.)
params['x_0'] = -llcornerx # add cartesian offset so lower left corner = (0,0)
params['y_0'] = -llcornery
# create a Proj instance for desired map.
proj = Proj(params)

# set the default params for imshow rc('image', origin='lower', cmap='jet')

ax = subplot(111) nx = 349; ny = 277 dx = 32463.41; dy = 32463.41 xmax = (nx-1)*dx; ymax = (ny-1)*dy # size of domain to plot

t1 = time.clock()
segnum = 0
lons = []
lats = []
for line in wcl:
     splitline = line.split()
     if splitline[0] != '#':
         lon = float(splitline[0])
   lat = float(splitline[1])
   lons.append(lon)
   lats.append(lat)
xs, ys = proj(Numeric.array(lons),Numeric.array(lats))
minx, maxx = min(xs), max(xs) miny, maxy = min(ys), max(ys) xypts = zip(xs.tolist(),ys.tolist())

wcl.close() # close coastline file
# all args are sequences, length 1 in case of linewidths and # antialiased collection = LineCollection(segments = xypts,
                             colors = ( (0,0,0,1), ), # black
                             linewidths = (1.5,),
                             antialiaseds = (0,),) # turn off aa for speed ax.add_collection(collection) ax.update_datalim( ((minx, miny), (maxx, maxy)))

axis([0, xmax, 0, ymax]) ax.set_xticks([]) # no ticks ax.set_yticks([]) title('Lambert Conformal Conic Projection') t2 = time.clock()
print t2-t1
show()

But when I run it, I get the following error:

Traceback (most recent call last):
   File "/sw/lib/python2.3/site-packages/matplotlib/backends/backend_gtkagg.py", line 75, in callback
     self.draw()
   File "/sw/lib/python2.3/site-packages/matplotlib/backends/backend_gtkagg.py", line 42, in draw
     agg.draw()
   File "/sw/lib/python2.3/site-packages/matplotlib/backends/backend_agg.py", line 291, in draw
     self.figure.draw(self.renderer)
   File "/sw/lib/python2.3/site-packages/matplotlib/figure.py", line 236, in draw
     for a in self.axes: a.draw(renderer)
   File "/sw/lib/python2.3/site-packages/matplotlib/axes.py", line 668, in draw
     c.draw(renderer)
   File "/sw/lib/python2.3/site-packages/matplotlib/collections.py", line 286, in draw
     self._transOffset)
TypeError: CXX: type error.

-Jeff

···

On Tue, 13 Jul 2004, John Hunter wrote:

  -- Jeffrey S. Whitaker Phone : (303)497-6313
NOAA/OAR/CDC R/CDC1 FAX : (303)497-6449
325 Broadway Web : http://www.cdc.noaa.gov/~jsw
Boulder, CO, USA 80305-3328 Office: Skaggs Research Cntr 1D-124