RFC on basemap changes

I've created a new basemap branch

https://matplotlib.svn.sourceforge.net/svnroot/matplotlib/trunk/toolkits/basemap-testing

with lots of under-the-hood optimizations and code refactoring. The most significant change is that I now use the GEOS library (http://geos.refractions.net) to determine whether a given coastline or political boundary feature is within the map projection region or not. If it is, the library clips it to the map region, and only the clipped features are processed. This results in big speedups (more than a factor of 10) for small map regions with high-resolution coastlines and boundaries. These changes were motivated by the oceanographers (Eric and Rob Hetland) who often deal with quite small domains.
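
To make the "test, then clip" idea above concrete, here is a minimal pure-Python sketch. It is not the GEOS-based code in the branch: it swaps in Sutherland-Hodgman clipping against an axis-aligned rectangle, which is far less general than what GEOS does, and it assumes the map region can be described as (xmin, ymin, xmax, ymax) in projection coordinates. The function names (`bbox_overlaps`, `clip_to_region`, `visible_features`) are hypothetical and exist only for illustration.

```python
def bbox_overlaps(points, region):
    """Cheap rejection test: does the feature's bounding box touch the region?"""
    xmin, ymin, xmax, ymax = region
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(xs) >= xmin and min(xs) <= xmax and max(ys) >= ymin and min(ys) <= ymax


def _inside(p, axis, bound, keep_ge):
    # Is point p on the kept side of the line coord[axis] == bound?
    return p[axis] >= bound if keep_ge else p[axis] <= bound


def _cross(p, q, axis, bound):
    # Point where segment p-q crosses the line coord[axis] == bound.
    t = (bound - p[axis]) / (q[axis] - p[axis])
    other = 1 - axis
    val = p[other] + t * (q[other] - p[other])
    return (bound, val) if axis == 0 else (val, bound)


def clip_to_region(points, region):
    """Clip a polygon (list of (x, y) vertices) to a rectangle, one edge at a time."""
    xmin, ymin, xmax, ymax = region
    edges = [(0, xmin, True), (0, xmax, False), (1, ymin, True), (1, ymax, False)]
    out = list(points)
    for axis, bound, keep_ge in edges:
        src, out = out, []
        for i, cur in enumerate(src):
            prev = src[i - 1]  # wraps around to the last vertex when i == 0
            cur_in = _inside(cur, axis, bound, keep_ge)
            prev_in = _inside(prev, axis, bound, keep_ge)
            if cur_in != prev_in:
                out.append(_cross(prev, cur, axis, bound))
            if cur_in:
                out.append(cur)
    return out


def visible_features(features, region):
    """Keep only features that intersect the region, clipped to it."""
    clipped = []
    for points in features:
        if not bbox_overlaps(points, region):
            continue  # entirely outside: skipped without any further work
        piece = clip_to_region(points, region)
        if piece:
            clipped.append(piece)
    return clipped
```

The speedup for small domains comes from the early `continue`: most high-resolution coastline segments never reach the more expensive clipping step or anything downstream of it.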

There are a couple of downsides:

1) an external dependency on the GEOS lib (which is LGPL). I've included a copy of the source in svn, but there's still an extra ./configure; make; make install required. Eric and I searched high and low for a lighter weight, well-tested, BSD licensed solution, without much luck. General, robust polygon clipping is trickier than I thought.

2) more code paths through Pyrex C-extension code, increasing the probability of segfaults and bus errors.

All in all, I think the speed gains and increased code readability are worth it. However, before I make a new release I'd appreciate any feedback. Are there build issues? (esp. on Windows, which I've not tested) Does it feel faster? Is the dependence on an LGPL'ed lib a problem?

-Jeff

--
Jeffrey S. Whitaker Phone : (303)497-6313
Meteorologist FAX : (303)497-6449
NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker@...236...
325 Broadway Office : Skaggs Research Cntr 1D-124
Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

Jeff Whitaker wrote:

1) an external dependency on the GEOS lib (which is LGPL).

Would it be any better to depend on an existing Python binding to GEOS? Here's one option:

http://trac.gispython.org/projects/PCL/wiki/Shapely

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

Christopher Barker wrote:

Jeff Whitaker wrote:

1) an external dependency on the GEOS lib (which is LGPL).

Would it be any better to depend on an existing Python binding to GEOS? Here's one option:

http://trac.gispython.org/projects/PCL/wiki/Shapely

That is what Jeff started with, but it uses ctypes, and adds complexity to the licensing and the installation. Therefore he wrote his own pyrex binding to the C API of GEOS, which avoids all that and works fine for basemap purposes.

Eric

Christopher Barker wrote:

Jeff Whitaker wrote:
  

1) an external dependency on the GEOS lib (which is LGPL).

Would it be any better to depend on an existing Python binding to GEOS? Here's one option:

http://trac.gispython.org/projects/PCL/wiki/Shapely

-Chris

Chris: I prototyped the code changes with Shapely (which uses ctypes), then coded my own pyrex replacement, both to get more speed and to avoid depending on ctypes too.

-Jeff

--
Jeffrey S. Whitaker Phone : (303)497-6313
NOAA/OAR/CDC R/PSD1 FAX : (303)497-6449
325 Broadway Boulder, CO, USA 80305-3328

Jeff,

as you mentioned license as one issue in not using shapely, I thought you might be interested in this:

-------- Original Message --------
Subject: [Community] Proposal to change Shapely license from LGPL to BSD
From: Sean Gillies <sgillies@...571...>

I propose that the Shapely license be changed to the 3 clause modified
BSD used by OWSLib, GeoJSON, Rtree, and WorldMill. I chose LGPL
originally to match the GEOS license, but I think consistency across the
new GIS-Python projects is more important.
--------

As for ctypes vs. pyrex -- I find it ironic that you've chosen pyrex for dependency reasons -- there are a lot of folks using ctypes to save the hassles of compilation, particularly on Windows (see GeoDjango, for instance). But I guess MPL requires compilation anyway.

I'm just poking at this 'cause I'd really like to see as little redundancy of Python bindings for stuff as possible. We already have GEOS bound by OGR, Shapely, (I think) GeoDjango, and who knows who else.

Oh well, we all need to do what works best for our needs. I guess it's a real credit to tools like swig, ctypes and pyrex (and Python itself) that anyone would even consider writing their own bindings to something!

-Chris

Christopher Barker wrote:
[...]

As for ctypes vs. pyrex -- I find it ironic that you've chosen pyrex for dependency reasons -- there are a lot of folks using ctypes to save the hassles of compilation, particularly on Windows (see GeoDjango, for instance). But I guess MPL requires compilation anyway.

Not just dependency reasons. When I tried to use Jeff's version with Shapely, I had to set LD_LIBRARY_PATH=/usr/local/lib for ctypes to find my geos library. I couldn't find any cleaner way to do it, and I don't consider that acceptable. Is there a clean way to tell ctypes where a library is? I was surprised it was not even checking that standard location.

Also, pyrex does a *lot* more than ctypes, and I think Jeff will tell you he has put that power to good use in his new basemap version.

Eric

Christopher Barker wrote:

As for ctypes vs. pyrex -- I find it ironic that you've chosen pyrex for dependency reasons -- there are a lot of folks using ctypes to save the hassles of compilation, particularly on Windows (see GeoDjango, for instance). But I guess MPL requires compilation anyway.

I'm just poking at this 'cause I'd really like to see as little redundancy of Python bindings for stuff as possible. We already have GEOS bound by OGR, Shapely, (I think) GeoDjango, and who knows who else.

Oh well, we all need to do what works best for our needs. I guess it's a real credit to tools like swig, ctypes and pyrex (and Python itself) that anyone would even consider writing their own bindings to something!

-Chris

Chris: I don't consider the pyrex GEOS interface I created for basemap a general-purpose binding - it's limited to only the functionality that basemap needs. The ability of pyrex to do loops in C makes it a lot faster than the Shapely ctypes interface though. I have corresponded with Sean Gillies about this (and the LGPL licensing issue), and as a result I think Shapely 2.0 will be based on Pyrex and will have a BSD license. Perhaps then I can switch basemap back to using it.

Eric mentioned the fact that ctypes often has a hard time finding the shared library to load. I view this as a potential support nightmare for matplotlib if we use ctypes to interface with external libs, which might be installed anywhere. You can't even tell ctypes where to look - it's hard-coded into the Shapely __init__ and there's no way I can see to override the search path that ctypes uses.
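
For reference, ctypes itself does accept an explicit path: `ctypes.CDLL` can be given a full pathname, and `ctypes.util.find_library` can be tried first. A wrapper could therefore expose an override, roughly as in the sketch below. This is only a sketch under stated assumptions, not what Shapely does: the environment variable name `GEOS_LIBRARY_PATH`, the helper names, the extra directory list and the `lib<name>.so` naming (Linux-style) are all made up for illustration.

```python
import ctypes
import ctypes.util
import os


def candidate_libraries(name, env_var, extra_dirs=("/usr/local/lib",)):
    """Return library paths/names to try, most explicit first."""
    candidates = []
    explicit = os.environ.get(env_var)  # user override, if set
    if explicit:
        candidates.append(explicit)
    found = ctypes.util.find_library(name)  # platform search; may return None
    if found:
        candidates.append(found)
    for directory in extra_dirs:
        # Assumption: Linux-style shared library file names.
        candidates.append(os.path.join(directory, "lib%s.so" % name))
    return candidates


def load_first(candidates):
    """Load the first candidate that ctypes can open, or return None."""
    for path in candidates:
        try:
            return ctypes.CDLL(path)
        except OSError:
            continue  # not there or not loadable; try the next one
    return None


# Example use (geos_c is the GEOS C API library):
# lib = load_first(candidate_libraries("geos_c", "GEOS_LIBRARY_PATH"))
```

That does not remove the support burden, since someone still has to know where the library lives, but it avoids having to set LD_LIBRARY_PATH for the whole process.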

-Jeff

Jeff Whitaker wrote:

Chris: I don't consider the pyrex GEOS interface I created for basemap a general-purpose binding - it's limited to only the functionality that basemap needs.

sure, but if there was an existing general purpose binding that met your needs, you wouldn't need this.

> The ability of pyrex to do loops in C makes it a lot faster than the Shapely ctypes interface though.

True -- that is key. If the C lib doesn't provide a "vectorized" API, then you do need to find a way to loop in C yourself.

> I have corresponded with Sean Gillies about this (and the LGPL licensing issue), and as a result I think Shapely 2.0 will be based on Pyrex and will have a BSD license. Perhaps then I can switch basemap back to using it.

Cool -- that is open-source collaboration as it should be!

Eric mentioned the fact that ctypes often has a hard time finding the shared library to load.

You can't even tell ctypes where to look - it's hard-coded into the Shapely __init__ and there's no way I can see to override the search path that ctypes uses.

Wow! that does seem an oversight. I haven't used ctypes myself -- frankly, I think people are trying to push it a bit too far, it's an excellent solution for calling the occasional system lib, but maybe not so much for writing full-featured wrappers.

Pyrex is very cool though. For the moment, I'm struggling with SWIG, and sadly it is a bit of a struggle. However, I'm doing it 'cause it's used by wxPython, GDAL, and VTK, all of which I want to be able to hack on a bit, so I might as well learn it.

Anyway -- basemap is looking better and better!

-Chris
