I use matplotlib to generate x-y data plots, i.e., 2-D plots. The problem is that the output files (the PDF files containing the plots) are huge: I can generate files that are hundreds of KB or even several MB. This seems absurd to me. Files of this size cause the programs that use them to grind to a halt. My goal is to reduce the size of the plot files that I produce with matplotlib. Details follow.
I use matplotlib from EPD.
Enthought Canopy Python 2.7.3 | 64-bit | (default, Aug 8 2013, 05:37:06)
I'm using Mac OS X Version 10.8.4.
I use home-grown code whose starting point was an example on the matplotlib website.
My relevant imports are:
import matplotlib.pyplot as plt
My plotting code lines are:
outfile = "basefile" + ".pdf"
## pylab.savefig(outfile, bbox_inches=0)
My PDF files contain simple plots which consist of (a) data points only, (b) lines between data points (data points not plotted), or (c) both data points and lines.
The problem is consistent: the files produced are much larger than seems reasonable.
For example, most recently, I am plotting 3 data sets; each data set has about 90,000 points. If I plot all three sets in one PDF figure, the file size is over 2 MB.
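To make this reproducible without my real data, here is a minimal sketch of the situation. It assumes synthetic random-walk data in place of my actual data sets, and it writes to a temporary directory; `N_POINTS`, `N_SETS`, and the marker settings are illustrative choices, not my exact code.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

N_POINTS = 90000  # roughly the size of each of my data sets
N_SETS = 3        # number of data sets drawn in one figure

# Assumption: synthetic stand-in data, not my real measurements.
rng = np.random.RandomState(0)

fig, ax = plt.subplots()
x = np.arange(N_POINTS)
for _ in range(N_SETS):
    y = rng.standard_normal(N_POINTS).cumsum()  # random walk
    # Case (c) from above: both data points and connecting lines.
    ax.plot(x, y, marker=".", linestyle="-", markersize=2)

outfile = os.path.join(tempfile.mkdtemp(), "basefile" + ".pdf")
fig.savefig(outfile)
plt.close(fig)

size_bytes = os.path.getsize(outfile)
print("%s: %.1f KB" % (outfile, size_bytes / 1024.0))
```

The printed size will depend on the data and the matplotlib version, so I have not quoted a number here.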
This seems absurd to me. I used R for plotting for six years (again with my own home-grown code), making the same kinds of plots and figures, and never had this issue.
I thought it might be a vector/raster issue, but the following web page says that PDFs are generated as vector images, which, to my understanding (which could be wrong), is the more compact format.
Is there a command I can use to reduce the file size? Since I am using these in reports and publications, the figures are almost always less than 3 inches by 3 inches in size, so I never need to take a raster figure and blow it up. I am therefore not concerned about the pixelation that occurs when an image is increased in size.
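To check the vector/raster idea myself, something like the following sketch could save the same 3 inch by 3 inch figure once as PDF and once as PNG and print both file sizes. The random-walk data, the 300 dpi value, and the temporary output directory are assumptions for illustration, not my real setup.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Assumption: one synthetic data set of about the same size as mine.
rng = np.random.RandomState(0)
x = np.arange(90000)
y = rng.standard_normal(x.size).cumsum()

fig, ax = plt.subplots(figsize=(3, 3))  # figure size in inches
ax.plot(x, y, ".", markersize=2)        # case (a): data points only

outdir = tempfile.mkdtemp()
sizes = {}
for ext in ("pdf", "png"):
    path = os.path.join(outdir, "basefile." + ext)
    # dpi=300 is an assumed print resolution; it sets the PNG pixel dimensions.
    fig.savefig(path, dpi=300)
    sizes[ext] = os.path.getsize(path)
plt.close(fig)

for ext in sorted(sizes):
    print("%s: %.1f KB" % (ext, sizes[ext] / 1024.0))
```

If the PNG turns out much smaller than the PDF for the same figure, that would suggest the number of plotted points, rather than the figure dimensions, is what drives the PDF size.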
Thank you very much.