shapefile generation example

Dear Python/Matplotlib/Ogr Users:

We are recent converts to Python and are having trouble with some of its functionality.
We’d like to submit our case for your consideration in the hope of getting some educated help on the subject.

The problem:
When we export the contour collections generated by contourf, the resulting shapefile contains overly simplified contours that poorly approximate the underlying field.

To reproduce the problem, we wrote a Python script that defines a 2-D analytical shape. This shape has small noise perturbations added in order to simulate natural geophysical fields (wind speed, for example).
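
A field of this kind can be sketched roughly as follows. This is not the attached test_case.py; the Gaussian bump, the grid size, the noise amplitude, and the function name noisy_field are all assumptions chosen only for illustration.

```python
import numpy as np


def noisy_field(n=200, noise_amplitude=0.02, seed=0):
    """Return (x, y, z): a smooth 2-D Gaussian bump plus small random noise.

    All parameters are illustrative assumptions, not values from the thread.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(-3.0, 3.0, n)
    y = np.linspace(-3.0, 3.0, n)
    xx, yy = np.meshgrid(x, y)
    smooth = np.exp(-(xx**2 + yy**2) / 2.0)  # the analytical shape
    noise = noise_amplitude * rng.standard_normal(smooth.shape)
    return x, y, smooth + noise


x, y, z = noisy_field()
print(z.shape, float(z.min()), float(z.max()))
```

The noise makes each contour line wiggle on the scale of the grid spacing, which is the fine structure that should survive in any exported geometry.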

The shape is sliced by the contourf command, and the resulting collection is plotted to a PDF file (PostScript) and converted to an output shapefile using the OGR module.

We also wrote several functions, defined inside the script, that take care of unpacking the contour collections and exporting them as polygon or multipolygon shapefile entities through the OGR shapefile methods.
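
The export helpers themselves are not reproduced in this thread. As a rough sketch of the idea only, one way to hand a ring of contour vertices to OGR is to format it as WKT first. The helper names ring_to_wkt and polygon_to_wkt are hypothetical, and deciding which rings are exteriors and which are holes is left out.

```python
def ring_to_wkt(ring, precision=6):
    """Format one ring (a sequence of (x, y) pairs) as a WKT coordinate list.

    The ring is closed by repeating the first point if it is not already closed.
    """
    pts = [(float(px), float(py)) for px, py in ring]
    if pts and pts[0] != pts[-1]:
        pts.append(pts[0])
    coords = ", ".join(f"{px:.{precision}f} {py:.{precision}f}" for px, py in pts)
    return "(" + coords + ")"


def polygon_to_wkt(exterior, holes=(), precision=6):
    """Build a WKT POLYGON string from an exterior ring and optional hole rings."""
    rings = [ring_to_wkt(exterior, precision)]
    rings += [ring_to_wkt(hole, precision) for hole in holes]
    return "POLYGON (" + ", ".join(rings) + ")"


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(polygon_to_wkt(square, precision=1))
# POLYGON ((0.0 0.0, 1.0 0.0, 1.0 1.0, 0.0 1.0, 0.0 0.0))
```

A string like this can be passed to ogr.CreateGeometryFromWkt() from the osgeo bindings. Because every vertex in the ring is written out verbatim, any loss of detail has to happen before this step.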

Two zoomed-in views are attached (screenshot_* attachments): (1) a portion of the PDF figure, and (2) a visualization of the shapefile data for the same area. The PDF figure shows a contour line with fine-scale structure (the noise we added), while the output shapefile lacks that fine structure. The PDF plot is what we expect. The output shapefile geometry is very different from what we would expect.

We can’t understand how a call to contourf can produce a plot that looks right AND shapefile data (taken from contourf’s collections) that appear to grossly simplify the geometry. We expect that both the plot and the shapefile come from the same contourf function results, yet look totally different.

We’d like to ask whether anyone else has encountered limitations on the complexity of shapefiles written out by Python.
Is this a possible problem with the matplotlib.pyplot.contourf.collections method?

We would appreciate your help very much!
Test script and the resulting shapefile data set are attached.

Thank you!

Archive_0.002.zip (6.03 KB)

test_case.py (4.32 KB)

screenshot_Florida_SpatialKey.jpg

screenshot_Florida_PDF.png


S. Mark Leidner
Staff Scientist/Oklahoma Business Development
Atmospheric and Environmental Research, Inc.
350 David L. Boren Blvd, Suite 1535
Norman, OK 73072-7264 USA
ph: +1 405 325-1137
cell: +1 781 354-5969

Sergey Vinogradov, Ph.D., Staff Scientist
Atmospheric and Environmental Research, Inc.
131 Hartwell Ave., Lexington, MA 02421, USA
Phone: 1-781-761-2256 sergey@…3763…
Fax: 1-781-761-2299 http://www.aer.com
Web page :: http://ocean.mit.edu/~svinogra

I am not able to run the test case because I don’t have osgeo (also note that Nio isn’t used in the given example). However, I have a guess as to what is going on. In mpl, there is path simplification logic that reduces the complexity of paths. There have been bugs in this logic in the past, so it would be valuable to know which version of matplotlib you are using.

This simplification code is probably being activated within the call to to_polygons().

Ben Root


Ben,

Good to hear from you.

We are using matplotlib v1.0.1_5, installed from MacPorts.

Hearing that there is simplification logic is very intriguing.

Mark


Try this and tell me if the results are better. Right before the line where you call to_polygons(), add this line:

multipolygon.should_simplify = False

The simplification logic is triggered automatically if rcParams['path.simplify'] is True, the path has more than 128 vertices, and those vertices are all simple LINETO segments. I think all of this is true in your situation. So you can either turn off simplification directly, as above, or set path.simplify to False in your matplotlibrc file (but this may make graph rendering significantly slower and more resource-intensive overall).

The path simplification logic is designed so that one does not see any visual differences. However, some additional logic might be needed for those who access the path directly.
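
Both options can be tried on a synthetic path, as in the minimal sketch below. The noisy ring stands in for one contour polygon, and all variable names are assumptions. The exact conditions under which simplification applies differ between matplotlib versions, so the printed vertex counts should be treated as something to check rather than as guaranteed values.

```python
import numpy as np
import matplotlib
from matplotlib.path import Path

# A closed ring with small radial noise, standing in for one contour polygon.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
radius = 1.0 + 0.001 * rng.standard_normal(theta.size)
verts = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
verts = np.vstack([verts, verts[:1]])  # repeat the first point to close the ring

# Explicit MOVETO/LINETO codes: only simple line segments, no curves.
codes = np.full(len(verts), Path.LINETO, dtype=Path.code_type)
codes[0] = Path.MOVETO

path = Path(verts, codes)
print("should_simplify:", path.should_simplify)

# Vertex count with whatever simplification the defaults apply.
n_default = sum(len(poly) for poly in path.to_polygons())

# Option 1: switch simplification off on this one path before to_polygons().
path.should_simplify = False
n_full = sum(len(poly) for poly in path.to_polygons())
print("vertices (default, unsimplified, input):", n_default, n_full, len(verts))

# Option 2: disable simplification through rcParams; paths created while the
# setting is off report should_simplify as False.
with matplotlib.rc_context({"path.simplify": False}):
    path_rc = Path(verts, codes)
print("should_simplify with path.simplify off:", path_rc.should_simplify)
```

Setting should_simplify on the individual path keeps fast rendering for everything else, while the rcParams route affects every path created while the setting is off.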

I hope this helps!

Ben Root


Your illustration seems horrendously complex. Can you distill it down to a simplest-possible case? Doing so might help you figure out where the problem is. I don't think it has anything to do with path simplification, because that occurs when the path is rendered. If I understand correctly, what mpl is plotting directly from your data is fine; you are running into surprises with the shapefile that you are generating from what mpl is using for its plotting. So, the problem would seem to be in the generation of the shapefile, not in mpl.

Eric


Ben,

I am finally replying to your most helpful post about shapefile generation.

Indeed, we found that turning off multipolygon path simplification just before the call to to_polygons() did the trick: multipolygons now preserve all of the vertices that define them, in some cases more than 10K points.

Thanks so much for pointing us in the right direction.

Mark
