eps export files crash text processors

Hi,

When I use matplotlib for a scatter plot with both dots and connecting lines,
the exported eps file is huge if the distances between many points are small.
I consider this a bug, since no preview tiff is included in the generated eps
and a variety of text processing applications (including OpenOffice) crash
when I try to import it. Ghostscript takes forever, too. Is there anything
I can do to export eps files of a reasonable size?

I am using:
python 2.4.2
matplotlib 0.87.2
numpy 0.9.8

with Linux

The following small example illustrates the problem:

--
import pylab,numpy,random
random.seed()
x=[random.gauss(0,1)/float(i)**2 for i in xrange(1,1000000)]
X=numpy.array(x,numpy.Float32)
pylab.plot(X[1:],X[:-1],"-", c="#eeeeee")
pylab.plot(X[1:],X[:-1],"xk")
pylab.show()
--

The resulting eps file:
-rw-r----- 1 xx users 212190257 Jul 4 09:39 image.eps

Thanks a lot in advance

Martin

Hi Martin,

I suggest upgrading to 0.87.3. I can run your test script and can open the
resulting eps file. Please note, however, that you are asking for a file that
plots 2e6 points. We have already optimized the postscript commands so that
each marker or line segment requires essentially a single line of postscript
code (a great improvement over mpl version 0.87.2), but that still means
we need to write 2e6 points x 18 bytes per point ~ 36 MB (it actually ends up
being 41 MB), and postscript then has to interpret that massive file. The
optimization yields about a factor of 5 improvement in the size of your file
(212 MB down to 41 MB), and it's about the best we can do for postscript.
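
As a rough check of the arithmetic above, here is a minimal sketch. The 18 bytes per drawn element is the figure quoted above; the function name and the "two layers" parameter (one pass for the line, one for the markers) are only illustrative assumptions, not anything from the postscript backend itself.

```python
def estimate_ps_size_mb(n_points, n_layers=2, bytes_per_element=18):
    """Rough size estimate: one short line of postscript per drawn element."""
    total_bytes = n_points * n_layers * bytes_per_element
    return total_bytes / 1e6


# About 1e6 data points, drawn once as line segments and once as markers.
estimated_mb = estimate_ps_size_mb(10**6)
print(estimated_mb, "MB estimated")

# Ratio between the 0.87.2 file reported in this thread and the 41 MB file.
improvement = 212190257 / 41e6
print(round(improvement, 1), "x smaller")
```

The estimate ignores the file header and any per-path setup, which is one plausible reason the real file comes out somewhat larger than 36 MB.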

If you want a smaller file size, I suggest you convert your file to some other
format, since most of your data points lie on top of one another, and then
convert back to eps if you really need that format. For example, you can
convert your 41MB eps file into a 274KB pdf file or a 6KB png file.
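
A minimal sketch of getting the smaller formats, with one substitution stated plainly: instead of converting an existing eps file, it saves png and pdf directly from matplotlib. It assumes a recent matplotlib and NumPy (not the 0.87.x API discussed in this thread), uses fewer points than the test script so it runs quickly, and the helper name save_compact is made up for illustration. The sizes it prints will not match the numbers quoted above.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np


def save_compact(x, y, basename, dpi=100):
    """Save the same line-plus-marker plot as PNG and PDF; return file sizes."""
    fig, ax = plt.subplots()
    ax.plot(x, y, "-", color="#eeeeee")
    ax.plot(x, y, "xk")
    sizes = {}
    for ext in ("png", "pdf"):
        path = basename + "." + ext
        fig.savefig(path, dpi=dpi)
        sizes[path] = os.path.getsize(path)
    plt.close(fig)
    return sizes


# Same kind of data as the test script, with fewer points.
rng = np.random.default_rng(0)
n = 10000
X = rng.normal(0.0, 1.0, n) / np.arange(1, n + 1) ** 2

sizes = save_compact(X[1:], X[:-1], os.path.join(tempfile.mkdtemp(), "image"))
for path, size in sizes.items():
    print(path, size, "bytes")
```

The png size depends only on the pixel dimensions and how well the image compresses, not on the number of points, which is why a raster format stays small here.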

Finally, we don't include tiff previews in our eps files, so this is not a bug.

Darren

On Friday 07 July 2006 12:21, Martin Manns wrote:

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users

--
Darren S. Dale, Ph.D.
Cornell High Energy Synchrotron Source
Cornell University
200L Wilson Lab
Rt. 366 & Pine Tree Road
Ithaca, NY 14853

dd55@...163...
office: (607) 255-9894
fax: (607) 255-9001

Martin,

When I try your example with svn matplotlib, I get a 34 MB eps file, and looking at it, I don't see much room for making it smaller--there is one obvious optimization, abbreviating "marker", but that's it. (The svg file is 456 MB!) So, maybe some major optimization has already been done between mpl 0.87.2 and svn.

The bigger problem is that each file format has basic characteristics and limitations. If you draw a million markers and line segments, you are inevitably going to have a big postscript file, unless the postscript backend somehow detects the fact that almost all of your points are indistinguishable and therefore deletes most of them--and this is really asking too much of a plotting backend, I think. (An alternative is to generate a pixel image and make the postscript from that; this is what matlab does under some circumstances, but it can result in big files of poor quality.)

Your options include: filter your points beforehand so you only plot points that are distinct; or use a pixel-based format like png, which keeps the file size under control.
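
The first option can be sketched roughly as follows. This is only one possible way to filter, under stated assumptions: points are snapped to a grid over the data range and the first point in each grid cell is kept, in the original order. The function name thin_points and the default resolution of 1000 cells per axis are hypothetical choices, and the input is assumed to be non-empty. Because points are dropped, the connecting line is only an approximation of the original one.

```python
import numpy as np


def thin_points(x, y, resolution=1000):
    """Keep the first point in each cell of a resolution x resolution grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def to_cells(v):
        # Map values to integer cell indices in [0, resolution - 1].
        span = v.max() - v.min()
        if span == 0:
            return np.zeros(len(v), dtype=np.int64)
        cells = ((v - v.min()) / span * resolution).astype(np.int64)
        return np.minimum(cells, resolution - 1)

    cell_id = to_cells(x) * resolution + to_cells(y)
    # return_index gives the index of the first occurrence of each cell id.
    _, first = np.unique(cell_id, return_index=True)
    keep = np.sort(first)  # restore the original plotting order
    return x[keep], y[keep]


# Same kind of data as the test script, with fewer points.
rng = np.random.default_rng(0)
n = 100000
X = rng.normal(0.0, 1.0, n) / np.arange(1, n + 1) ** 2

xs, ys = thin_points(X[1:], X[:-1])
print(len(X) - 1, "points before,", len(xs), "after")
```

The thinned arrays can then be passed to pylab.plot in place of the originals, so the postscript backend only has to write the points that are visually distinct at the chosen resolution.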

Eric


Thanks for the suggestion. The abbreviation is "o" as of svn 2548.

On Friday 07 July 2006 3:30 pm, Eric Firing wrote:

Martin,

When I try your example with svn matplotlib, I get a 34 MB eps file, and
looking at it, I don't see much room for making it smaller--there is one
obvious optimization, abbreviating "marker", but that's it.