For some reason, my earlier reply didn't seem to make it to the mailing
list. Here it is in its entirety:
"""
If you assign each figure to a new number, pyplot will keep all of those
figures around in memory (because it assumes you may want to use them again).
The best route is to call close('all') or close(fig) with each loop iteration.
40MB per image doesn't sound way out of reason to me. How big are your
images?
"""
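For reference, a minimal sketch of the "close on each iteration" pattern described above. It assumes the non-interactive Agg backend so it runs without a display, and it uses small random arrays as stand-ins for the JPEGs on disk; the function name show_images is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np


def show_images(images):
    """Draw each image in its own figure and close it before the next one."""
    for im in images:
        fig = plt.figure()   # no explicit number, so nothing is kept by number
        plt.imshow(im)
        fig.canvas.draw()    # force the image to be rendered
        plt.close(fig)       # drop pyplot's reference to this figure
    return plt.get_fignums() # numbers of figures pyplot still tracks


# Stand-in data (hypothetical): three small random RGB arrays
images = [np.random.rand(48, 64, 3) for _ in range(3)]
open_figs = show_images(images)
print("figures still open:", open_figs)
```

If the figures need to stay on screen, the same idea applies at the end of the session: close('all') releases everything pyplot is holding.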
On 10/05/2009 03:46 AM, Leo Trottier wrote:
Hi,
I think I've figured out what's going on. It's a combination of things:
1) IPython is ignorant of the problems associated with caching massive data
output
2) IPython doesn't seem to have a good way to clear data from memory
reliably (Bug #412350 “%clear should also delete _NN references and Out[NN...” : Bugs : IPython)
IPython is designed for interactive use, and stores a lot of values so they
can be conveniently reused later. For long-running "batch" scripts, you can
use "regular" Python, or run the code in IPython such that it isn't
displayed at the console (by using "import" or "%run"). Fixing bug 2) may
help, but it looks like it would still require some manual intervention to be
useful. You're still using a tool designed for fine-grained interactive use
(e.g. a pen) where one designed for automation may be more appropriate (e.g. a
laser printer).
3) matplotlib/Python seems to be insufficiently aggressive in its garbage
collection (??)
Is that still true after forcibly closing the figures on each loop iteration
as I suggested? Many hours have been spent squashing memory leaks in
matplotlib, and I am not aware of any in at least 0.98 and later (other than
some unavoidable small leaks in certain GUI backends). Do you have a
standalone example that illustrates this on a recent version of matplotlib?
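A rough template for such a standalone check, offered as a sketch rather than a definitive leak test. It assumes the Agg backend and random arrays in place of real images, and it uses the standard-library tracemalloc module, which only sees allocations made through Python's allocator, so memory held by a GUI backend would not show up here. The function name traced_bytes_per_iteration is hypothetical.

```python
import gc
import tracemalloc

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np


def traced_bytes_per_iteration(n_iter=5, shape=(200, 300, 3)):
    """Record traced memory after each create/draw/close cycle."""
    tracemalloc.start()
    readings = []
    for _ in range(n_iter):
        im = np.random.rand(*shape)  # stand-in for imread(...)
        fig = plt.figure()
        plt.imshow(im)
        fig.canvas.draw()
        plt.close(fig)               # explicitly close the figure
        del im, fig
        gc.collect()                 # collect reference cycles before measuring
        current, _peak = tracemalloc.get_traced_memory()
        readings.append(current)
    tracemalloc.stop()
    return readings


readings = traced_bytes_per_iteration()
for n, nbytes in enumerate(readings, start=1):
    print("after iteration %d: %.2f MB traced" % (n, nbytes / 1e6))
```

Readings that keep climbing by roughly one image's worth per iteration would be worth reporting; readings that level off after the first pass or two suggest the figures are being released.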
4) For obvious reasons, JPGs are much bigger when stored as arrays (though
they still seem to take up more memory than they should)
It's pretty easy to estimate the memory requirements for an image. If the
image is true-color (by this, I mean not color-mapped), you'll need
4-bytes-per-pixel for the original image, plus a cached scaled copy (the
size of which depends on the output dpi), again with 4 bytes per pixel. For
color-mapped images, you'll have 4-byte floats for each pixel, 4-byte rgba
for the color-mapped image, and again a cached scaled copy of that. Not
knowing the size of your input images, it's impossible to say if 40MB per
image is way too big or not, but it's not unheard of by any means.
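The arithmetic above can be written out as a small sketch. The image and on-screen sizes below are hypothetical (the actual image sizes in this thread are unknown), and the function name estimate_image_bytes is only illustrative.

```python
BYTES_PER_PIXEL = 4  # RGBA, or one 4-byte float per pixel


def estimate_image_bytes(width, height, scaled_width, scaled_height,
                         color_mapped=False):
    """Rough memory estimate for one displayed image, following the rule above."""
    source_pixels = width * height
    scaled_pixels = scaled_width * scaled_height

    total = source_pixels * BYTES_PER_PIXEL       # original image as RGBA
    if color_mapped:
        total += source_pixels * BYTES_PER_PIXEL  # plus the 4-byte float data
    total += scaled_pixels * BYTES_PER_PIXEL      # cached copy scaled for output
    return total


# Hypothetical sizes: a 3000x2000 photo drawn into an 800x600 pixel area
true_color = estimate_image_bytes(3000, 2000, 800, 600)
mapped = estimate_image_bytes(3000, 2000, 800, 600, color_mapped=True)
print("true-color:   %.2f MB" % (true_color / 1e6))
print("color-mapped: %.2f MB" % (mapped / 1e6))
```

The source image dominates in both cases, so the number of pixels in the input matters much more than the size of the figure on screen.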
Problems 1-3 seem problematic enough that they will get fixed eventually.
... but (4) is a design issue. Assuming it's possible, it looks like there
could be benefits to making an array-like wrapper around PIL image objects
(perhaps similar in principle to a sparse matrix). Given PIL.ImageMath,
ImagePath, etc., it seems actually fairly doable. Wouldn't something like
this be of major benefit to people using SciPy for anything image-related?
Are you suggesting decompressing the JPEG on-the-fly with each redraw? I'm
not certain that would be fast enough for interactive use. It may be worth
experimenting with, but it would require a lot of changes to how matplotlib
works. It's also very tricky to get right -- I'm not aware of any image
processing applications that don't ultimately store a dense matrix of
uncompressed image data in memory, except for something like compressed
OpenGL textures on a graphics card. PIL certainly doesn't retain the
compressed JPEG in memory. So, I'm not sure the cost/benefit tradeoff is
right here -- the problems it solves can be solved much more easily without
sacrificing speed in other ways. That is, if the image data is simply too
large, it can be scaled before feeding it to imshow(). And generating
multiple figures in batch is not a problem if the figure is explicitly
closed.
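As a sketch of the "scale it before imshow()" suggestion, here is one simple way to shrink an array with NumPy. It keeps every Nth pixel, which is a crude stand-in for a proper resize (no smoothing), and the helper name downsample and the max_side parameter are hypothetical.

```python
import math

import numpy as np


def downsample(im, max_side=1024):
    """Return a smaller copy of im whose longest side is at most max_side."""
    height, width = im.shape[:2]
    step = max(1, math.ceil(max(height, width) / max_side))
    # Slicing alone returns a view that keeps the full array alive,
    # so copy the result to let the original be freed.
    return im[::step, ::step].copy()


# Stand-in for a large decoded JPEG (hypothetical size)
im = np.zeros((2000, 3000, 3), dtype=np.uint8)
small = downsample(im, max_side=1000)
print(im.shape, "->", small.shape)
print("bytes: %d -> %d" % (im.nbytes, small.nbytes))
```

Passing small rather than im to imshow() means both the stored image and the cached scaled copy are built from the reduced array.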
Hope this helps. I would like to get to the bottom of any memory leaks, so
if you can provide a standalone script that leaks, despite calling
close(fig) in each iteration, please let me know.
Cheers,
Mike
Leo
On Fri, Oct 2, 2009 at 7:45 AM, Michael Droettboom <mdroe@...86...> wrote:
If you assign each figure to a new number, pyplot will keep all of those
figures around in memory (because it assumes you may want to use them
again). The best route is to call close('all') or close(fig) with each
loop iteration.
40MB per image doesn't sound way out of reason to me. How big are your
images?
Mike
On 10/01/2009 10:25 PM, Leo Trottier wrote:
I have a friend who's having strange memory issues when opening and
displaying images (using Matplotlib).
Here's what he says:
#######################################
pylab seems really inefficient: Opening a few images and displaying them
eats up tons of memory, and the memory doesn't get freed.
Starting python, and run
In [5]: from glob import *;
In [6]: from pylab import *
python has 33MB of memory.
Run
In [7]: i = 1
In [8]: for imname in glob("*.JPG"):
   ...:     im = imread(imname)
   ...:     figure(i); i = i+1
   ...:     imshow(im)
   ...:
This opens 10 figures and displays them. Python takes 480MB of memory.
This is crazy, for 10 images -- 40+MB of memory for each!
In [14]: close("all")
In [15]: i = 1
In [16]: for imname in glob("*.JPG"):
   ....:     im = imread(imname)
   ....:     figure(i); i = i+1
   ....:     imshow(im)
   ....:
   ....:
This closes all figures and opens them again. Python takes up 837MB of
memory.
and so on... Something is really wrong with memory management.
##### System info: ##############
(using macosx backend)
2.4GHz MacBook Pro Intel Core 2 Duo
4GB 667MHz DDR2 SDRAM
In [5]: sys.version
Out[5]: '2.6.2 (r262:71600, Oct 1 2009, 16:44:23) \n[GCC 4.2.1 (Apple
Inc. build 5646)]'
In [6]: numpy.__version__
Out[6]: '1.3.0'
In [7]: matplotlib.__version__
Out[7]: '0.99.1.1'
In [8]: scipy.__version__
Out[8]: '0.7.1'
In [9]:
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net