Memory leaks

I've been looking into the memory leaks exercised by the memleak_gui.py unit test. I've searched through the mailing list for information, but I'm new to the party here, so forgive me if I'm not fully current.

Eric Firing wrote:
"I think we have a similar problem with all interactive backends (the only one I didn't test is Qt4Agg) which also makes me suspect we are violating some gui rule, rather than that gtk, qt3, wx, and tk all have leaks."

Unfortunately, from what I've seen, there isn't a single rule being broken; instead, I've been running into lots of different "surprises" in the various toolkits. But I am starting to suspect that my old versions of Gtk (or PyGtk) have some bona fide leaks.

I just finished submitting patches (to the tracker) for a number of memory leaks in the Tk, Gtk, and Wx backends (other backends will hopefully follow). I did all my testing on RHEL4 with Python 2.5 and recent SVN matplotlib (rev. 3244), so it's quite possible that memory leaks still remain on other platforms.

Tk:
See the patch:
http://sourceforge.net/tracker/index.php?func=detail&aid=1745400&group_id=80706&atid=560722

Even after this patch, Tkinter still leaks 28 bytes every time it is initialized (every time a toplevel window is created). There may be a way to avoid the subsequent initializations, but it wasn't immediately obvious to me, and given the small size of this leak, I've passed on it for now.

Gtk:
See the patch:
http://sourceforge.net/tracker/index.php?func=detail&aid=1745406&group_id=80706&atid=560722

The patch fixes a number of Python-level leaks.
Unfortunately, the Gdk rendering backend still leaks gtk.gdk.Window objects, and I have so far been unable to determine a fix. GtkAgg, however, does not have this leak.
Under pygtk-2.4.0, the toolbars leak gdk.Pixbuf objects, and the file selector dialogs leak as well.
Since these issues appear to be bugs in gtk and/or pygtk itself, my first step will probably be to try a more recent version of them. (The gtk installed on RHEL4 is ancient (2.4), and I haven't yet tried building my own. If anyone has a more recent version of pygtk handy, I would appreciate a report from memleak_gui.py after applying my patch.)

Wx:
See the patch:
http://sourceforge.net/tracker/index.php?func=detail&aid=1745408&group_id=80706&atid=560722

This one was fairly simple, though surprising. Top-level windows do not fully destroy themselves as long as there are pending events. Manually flushing the event queue causes the windows to go away. See the wxPython docs:
http://wxwidgets.org/manuals/stable/wx_wxwindow.html#wxwindowdestroy

There were no further leaks in Wx/WxAgg that I was able to detect (in my environment).
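
For reference, the fix boils down to flushing events right after destroying the frame. A minimal sketch of the idea (assuming a wx.App is already running; the function name is mine, not the literal patch):

    import wx

    def destroy_figure_frame(frame):
        frame.Destroy()
        # Process pending events so the top-level window is really torn down.
        wx.GetApp().Yield()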

As an aside, I thought I'd share the techniques I used to find these leaks (not to be pedantic, but this information was hard to come by online, and it may be useful to others).

For C/C++ memory leaks, I really like valgrind (though it is Linux-only); be sure to follow the directions to get it to play well with Python. I recommend rebuilding Python with "--without-pymalloc" to make memory reporting in general much more sensible (though slower). See:
    http://svn.python.org/view/python/trunk/Misc/README.valgrind
For an example, you can see the rsvg memory leak here:

==15979== 1,280 bytes in 20 blocks are definitely lost in loss record 13,506 of 13,885
==15979== at 0x4004405: malloc (vg_replace_malloc.c:149)
==15979== by 0x314941: (within /usr/lib/libart_lgpl_2.so.2.3.16)
==15979== by 0x315E0C: (within /usr/lib/libart_lgpl_2.so.2.3.16)
==15979== by 0x31624A: art_svp_intersector (in /usr/lib/libart_lgpl_2.so.2.3.16)
==15979== by 0x316660: art_svp_intersect (in /usr/lib/libart_lgpl_2.so.2.3.16)
==15979== by 0x6BFA86C: (within /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x6BFD801: rsvg_render_path (in /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x6BFD9E7: rsvg_handle_path (in /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x6BFFDB5: rsvg_start_path (in /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x6C07F59: (within /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x9C24BB: xmlParseStartTag (in /usr/lib/libxml2.so.2.6.16)
==15979== by 0xA4DADC: xmlParseChunk (in /usr/lib/libxml2.so.2.6.16)
==15979== by 0x6C08ED5: rsvg_handle_write_impl (in /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x6C09539: rsvg_handle_write (in /usr/lib/librsvg-2.so.2.8.1)
==15979== by 0x5D51B51: (within /usr/lib/gtk-2.0/2.4.0/loaders/svg_loader.so)
==15979== by 0x32871F: (within /usr/lib/libgdk_pixbuf-2.0.so.0.400.13)
==15979== by 0x32885E: gdk_pixbuf_new_from_file (in /usr/lib/libgdk_pixbuf-2.0.so.0.400.13)
==15979== by 0x5CEE34: (within /usr/lib/libgtk-x11-2.0.so.0.400.13)
==15979== by 0x5CF095: gtk_window_set_default_icon_from_file (in /usr/lib/libgtk-x11-2.0.so.0.400.13)
==15979== by 0x59FCD1B: _wrap_gtk_window_set_default_icon_from_file (gtk.c:38156)
==15979== by 0x80C29B6: PyEval_EvalFrameEx (ceval.c:3564)

For finding cycles that result in uncollectable garbage, I wrote a cycle-finding utility (see attached file). (I plan to submit this as a Python Cookbook entry). Given a list of objects of interest, it will print out all the reference cycles involving those objects (though it can't traverse through extension objects that don't expose reference information to the garbage collector).
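
The attached file is the real thing; what follows is only a minimal sketch of the same idea, using nothing but the gc module (the function name is mine):

    import gc

    def print_cycles(obj, max_depth=5):
        """Print reference chains that lead from obj back to itself."""
        def walk(node, chain):
            if len(chain) > max_depth:
                return
            for referent in gc.get_referents(node):
                if referent is obj:
                    print(' -> '.join(type(x).__name__ for x in chain + [obj]))
                elif not any(referent is seen for seen in chain):
                    walk(referent, chain + [referent])
        walk(obj, [obj])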

Objects that leak because legitimate Python references to them are still around are some of the trickiest to find. (This was an answer to a job interview question I was once asked: "How can you leak memory in Java or some other garbage-collected language?") I find it useful to compare snapshots of all the objects in the interpreter before and after a suspected leak:

    import gc

    existing_objects = set(id(x) for x in gc.get_objects())
    # ... do something that leaks ...
    # ... delete everything you can ...
    gc.collect()  # let the collector reclaim whatever it can first
    remaining_objects = [x for x in gc.get_objects()
                         if id(x) not in existing_objects]

One can then scour through remaining_objects for anything suspected of causing the problem and, if necessary, do cycle detection on those objects to find ways to forcibly remove them.
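
Continuing the snippet above, one quick way to scour the leftovers is to tally them by type and see what is accumulating (just a sketch, not code from matplotlib):

    from collections import defaultdict

    type_counts = defaultdict(int)
    for obj in remaining_objects:
        type_counts[type(obj).__name__] += 1

    # Show the twenty most common leftover types.
    for name, count in sorted(type_counts.items(),
                              key=lambda item: -item[1])[:20]:
        print('%6d  %s' % (count, name))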

Cheers,
Mike

cycle_finder.py (2.31 KB)

Mike,

All this sounds like great progress--thanks! I particularly appreciate the descriptions of what problems you found and how you found them.

John et al.: is there a maintainer for each of these backends? I think it is very important that Mike's patches be checked out and applied ASAP if they are OK; or if there is a problem, then that info needs to get back to Mike. This should be very high priority. I can do a quick check and commit if necessary, but it would make more sense for someone more familiar with backends and gui code to do it. Or at least for others to do some testing on different platforms if I make the commits.

Eric

gtk: Steve Chaplin or me
wx: Ken McIvor
qt: Darren?
tk: Charlie?

After we get these patches in, we can just give Michael commit
privileges :-) I can probably look at this Monday, but if you want to
commit and test some of these before then, please do so.

JDH

Done. It looks like there is still plenty of memory leakage, but there are improvements, and the huge list of uncollectable garbage with tkAgg is gone.

I also made memleak_gui.py more flexible with arguments. For example, here are tests with three backends, a generous number of loops, and suppression of intermediate output:

python ../unit/memleak_gui.py -d wx -s 500 -e 1000 -q

uncollectable list: []

Backend WX, toolbar toolbar2
Averaging over loops 500 to 1000
Memory went from 29316k to 31211k
Average memory consumed per loop: 3.7900k bytes

python ../unit/memleak_gui.py -d tkagg -s 500 -e 1000 -q

uncollectable list: []

Backend TkAgg, toolbar toolbar2
Averaging over loops 500 to 1000
Memory went from 29202k to 31271k
Average memory consumed per loop: 4.1380k bytes

python ../unit/memleak_gui.py -d gtkagg -s 500 -e 1000 -q

uncollectable list: []

Backend GTKAgg, toolbar toolbar2
Averaging over loops 500 to 1000
Memory went from 29324k to 31131k
Average memory consumed per loop: 3.6140k bytes

So, this test is still showing problems, with similar memory consumption in these three backends.

Eric

Eric Firing wrote:

So, this test is still showing problems, with similar memory consumption in these three backends.

Not necessarily. By default, Python allocates large pools from the operating system and then manages those pools itself (through its pymalloc allocator). Prior to Python 2.5, those pools were never freed. With Python 2.5, empty pools, when they occur, are freed back to the OS. Due to fragmentation issues, even if there is enough free space in those pools for new objects, new pools may need to be created anyway, since Python objects can't be moved once they are created. So seeing modest increases in memory usage during a long-running Python application is typical, and not something that can be avoided without micro-optimizing for pool performance (something that may be very difficult). If memory usage is truly increasing in an unbounded way, then, yes, there may be problems, but it should eventually stabilize (though in a test such as memleak_gui that may take many iterations). It's more interesting to see the curve of memory usage over time than the average over a number of iterations.

For further reading, see:
http://evanjones.ca/python-memory.html
README.valgrind in the Python source
http://mail.python.org/pipermail/python-dev/2006-March/061991.html

Because of this, using the total memory allocated by the Python process to track memory leaks is a pretty blunt tool. More useful metrics are the total number of GC-tracked objects (len(gc.get_objects())), the contents of gc.garbage, and the output of a tool like Valgrind or Purify for finding mismatched mallocs/frees. Another useful technique (which I haven't yet resorted to for matplotlib testing) is to build Python with COUNT_ALLOCS, which gives access to the total number of allocations and frees in the Python interpreter at runtime.
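
A rough sketch of what tracking those metrics per iteration might look like (the /proc reading is Linux-specific, and the helper name is mine):

    import gc

    def report_iteration(i):
        gc.collect()
        n_objects = len(gc.get_objects())
        n_garbage = len(gc.garbage)
        rss_kb = 0
        # Resident set size as the OS sees it (Linux only).
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                rss_kb = int(line.split()[1])
        print('%5d %8dk %8d objects %4d uncollectable'
              % (i, rss_kb, n_objects, n_garbage))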

IMO, the only reasonable way to use the total memory usage of Python to debug memory leaks is if you build Python without pool allocation (--without-pymalloc). That was how I was debugging memory leaks last week (in conjunction with valgrind, and the gc module), and with that configuration, I was only seeing memory leakage with Pygtk 2.4, and a very small amount with Tk. Are your numbers from a default build? If so, I'll rebuild my Python and check my numbers against yours. If they match, I suspect there's little we can do.

Cheers,
Mike

I agree. I just ran 2000 iterations with GtkAgg, plotted every 10th point, and the increase is linear (apart from a little bumpiness) over the entire range (not just the last 1000 iterations reported below):

Backend GTKAgg, toolbar toolbar2
Averaging over loops 1000 to 2000
Memory went from 31248k to 35040k
Average memory consumed per loop: 3.7920k bytes

Maybe this is just the behavior of pymalloc in 2.5?

I used stock Python 2.5 from ubuntu Feisty. I should compile a version as you suggest, but I haven't done it yet.

Eric

More results:

I've built and tested a more recent gtk+/pygtk stack (glib-2.12, gtk+-2.10.9, librsvg-2.16.1, libxml2-2.6.29, pygobject-2.13.1, pygtk-2.10.4...). The good news is that the C-level leaks I was seeing in pygtk 2.2 and 2.4 are resolved. In particular, using an SVG icon and Gdk rendering no longer seems problematic. I would suggest that anyone using old versions of pygtk should upgrade rather than spend time on matplotlib workarounds -- do you all agree? And my Gtk patch should probably be reverted to use an SVG icon for the window again (or to only do it on versions of pygtk > 2.xx). I don't know what percentage of users are still using pygtk-2.4 and earlier...

There is, however, a new patch (attached) to fix a leak of FileChooserDialog objects that I didn't see in earlier pygtk versions. I have to admit that I'm a bit puzzled by the solution -- it seems that the FileChooserDialog object refuses to destruct whenever any custom Python attributes have been added to the object. It doesn't really need them in this case so it's an easy fix, but I'm not sure why that was broken -- other classes do this and don't have problems (e.g. NavigationToolbar2GTK). Maybe a pygtk expert out there knows what this is about. It would be great if this resolved the linear memory growth that Eric is seeing with the Gtk backend.
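
For what it's worth, the workaround amounts to not hanging extra Python attributes off the dialog. A rough illustration of the pattern (not the attached patch; the names and dialog setup here are only for the example):

    import gtk

    _dialog_state = {}

    def run_save_dialog(parent):
        dialog = gtk.FileChooserDialog(
            title='Save the figure', parent=parent,
            action=gtk.FILE_CHOOSER_ACTION_SAVE,
            buttons=(gtk.STOCK_CANCEL, gtk.RESPONSE_CANCEL,
                     gtk.STOCK_SAVE, gtk.RESPONSE_OK))
        # Keep per-dialog state in an external dict instead of setting
        # attributes on the dialog itself, so nothing extra pins it.
        _dialog_state[id(dialog)] = {'ext': 'png'}
        try:
            if dialog.run() == gtk.RESPONSE_OK:
                return dialog.get_filename()
            return None
        finally:
            _dialog_state.pop(id(dialog), None)
            dialog.destroy()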

GtkCairo seems to be free of leaks.

QtAgg (qt-3.3) was leaking because of a cyclical reference in the signals between the toolbar and its buttons. (Patch attached).
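
As a generic illustration of breaking that kind of cycle (this is not the attached patch, and the class names are made up), one side of the relationship can hold only a weak reference:

    import weakref

    class Toolbar(object):
        def __init__(self):
            self.buttons = []

        def add_button(self, button):
            self.buttons.append(button)
            # The button gets only a weak reference back to the toolbar,
            # so toolbar <-> button no longer forms a reference cycle.
            button.toolbar_ref = weakref.ref(self)

    class Button(object):
        def on_clicked(self):
            toolbar = self.toolbar_ref()  # None once the toolbar is gone
            if toolbar is not None:
                print('%d buttons on this toolbar' % len(toolbar.buttons))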

Qt4 is forthcoming (I'm still trying to compile something that runs the demos cleanly).

I tried the FltkAgg backend, but it doesn't seem to close the window at all when the figure is closed -- instead I get dozens of windows open at once. Is that a known bug or correct behavior?

Cheers,
Mike

Forgot to attach the patches.

Oops,
Mike

backend_gtk.3444.patch (2.13 KB)

memleak_qt.3444.patch (992 Bytes)

Michael -- if you send me your sf ID I'll add you to the committers
list and you can check these in directly.

Vis-a-vis the gtk question, I agree that we should encourage people who
are suffering from the leak to upgrade rather than work around it. I
would like to summarize the status of known leaks for the FAQ, so
perhaps you could summarize, across the backends, what kind of leaks
remain in a --without-pymalloc build with the known problems fixed (eg
the gtk upgrade). If you could simply send me an update for the memory
leak FAQ (don't worry about the formatting, I can take care of that),
that would be great. Or if you are feeling doubly adventurous, you can
simply update the FAQ in the htdocs/faq.html.template svn document and
commit it along with your other changes.

Thanks for all the very useful and detailed work!

JDH

John Hunter wrote:

Forgot to attach the patches.

Michael -- if you send me your sf ID I'll add you to the committers
list and you can check these in directly.

mdboom

Vis-a-vis the gtk question, I agree that we should encourage people who
are suffering from the leak to upgrade rather than work around it. I
would like to summarize the status of known leaks for the FAQ, so
perhaps you could summarize, across the backends, what kind of leaks
remain in a --without-pymalloc build with the known problems fixed (eg
the gtk upgrade). If you could simply send me an update for the memory
leak FAQ (don't worry about the formatting, I can take care of that),
that would be great. Or if you are feeling doubly adventurous, you can
simply update the FAQ in the htdocs/faq.html.template svn document and
commit it along with your other changes.

Will do.

Cheers,
Mike

Eric Firing wrote:

I also made memleak_gui.py more flexible with arguments. For example, here are tests with three backends, a generous number of loops, and suppression of intermediate output:

Those changes are really helpful. I just added code to display the total number of objects in the Python interpreter (len(gc.get_objects())) with each iteration as well, as that can be useful. (A constant count doesn't rule out memory leaks, but if it is increasing, that is definitely a problem.)

I also added a commandline option to print out any cycles involving uncollectable objects, and added the necessary function to do so to cbook.py.
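
The gist is along these lines (a sketch only; print_cycles stands in for the new cbook function, which may differ):

    import gc

    gc.collect()
    if gc.garbage:
        print('%d uncollectable objects' % len(gc.garbage))
        for obj in gc.garbage:
            print_cycles(obj)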

Cheers,
Mike

Mike,

Good, thank you.

I just committed a change to the output formatting of memleak_gui so that if you redirect it to a file, that file can be loaded with pylab.load() in case you want to plot the columns. (At least this is true if you don't use the -c option.)
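
For example, something like this should work (the filename and the comments argument are illustrative; adjust them to the actual output):

    from pylab import load, plot, show

    # '#' marks the comment lines in the memleak_gui output.
    data = load('memleak_output.asc', comments='#')
    plot(data[:, 0], data[:, 1])   # OS memory (k) vs. iteration
    show()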

Yesterday, before your commits, I compared memleak_gui with stock Python 2.4 versus stock 2.5 (both from ubuntu feisty) and found very little difference in the OS memory numbers.

Eric

Eric Firing wrote:

I just committed a change to the output formatting of memleak_gui so that if you redirect it to a file, that file can be loaded with pylab.load() in case you want to plot the columns. (At least this is true if you don't use the -c option.)

Great. Sorry for stomping on that ;-)

Yesterday, before your commits, I compared memleak_gui with stock Python 2.4 versus stock 2.5 (both from ubuntu feisty) and found very little difference in the OS memory numbers.

Are they still increasing linearly? I'm still seeing some mystery leaks with Gtk and Qt4, and a much smaller one with Tk. Qt and Wx seem fine here. Unfortunately, Qt4 crashes valgrind, so it's not of much use.

I'm curious whether your results match that. I'm not terribly surprised that 2.4 isn't different from 2.5, since the case in which entire memory pools are freed in 2.5 is probably hard to trigger.

Cheers,
Mike

Attached are runs with gtk, wx, qtagg, and tkagg. Quite a variety of results: tkagg is best, with only slow memory growth and a constant number of python objects; qtagg grows by 2.2k per loop, with no increase in python object count; wx (which is built on gtk) consumes 3.5k per loop, with an increasing object count; gtk consumes 1.8k per loop with an increasing object count.

All runs are on stock ubuntu feisty python 2.5.

Eric

memleak_gtk.asc (4.83 KB)

memleak_wx.asc (4.83 KB)

memleak_qtagg.asc (4.83 KB)

memleak_tkagg.asc (4.83 KB)

I am swamped at work, and have not been able to follow this thread closely.
But I just updated from svn and ran memleak_gui.py with qt4:

# columns are: iteration, OS memory (k), number of python objects

#
   0 37364 53792
  10 37441 53792
  20 37441 53792
  30 37525 53792
  40 37483 53792
  50 37511 53792
  60 37539 53792
  70 37568 53792
  80 37596 53792
  90 37624 53792
 100 37653 53792
# columns above are: iteration, OS memory (k), number of python objects
#
# uncollectable list: []
#
# Backend Qt4Agg, toolbar toolbar2
# Averaging over loops 30 to 100
# Memory went from 37525k to 37653k
# Average memory consumed per loop: 1.8286k bytes

Darren

Eric Firing wrote:

Attached are runs with gtk, wx, qtagg, and tkagg. Quite a variety of results: tkagg is best, with only slow memory growth and a constant number of python objects; qtagg grows by 2.2k per loop, with no increase in python object count; wx (which is built on gtk) consumes 3.5k per loop, with an increasing object count; gtk consumes 1.8k per loop with an increasing object count.

All runs are on stock ubuntu feisty python 2.5.

Thanks for these results. Unfortunately, I'm seeing different results here. [dagnabbit!] None of them have an increasing object count for me, which leads me to suspect there's some version difference between your environment and mine that isn't being accounted for.

Gtk[Agg|Cairo] -- 1.3k per loop.
Wx[Agg] -- 0.010k per loop
QtAgg -- 2.3k per loop (which is in the same ballpark as your result)
Qt4Agg -- 1.4k per loop (which seems to be in the same ballpark as Darren Dale's result)
TkAgg -- 0.29k per loop

I don't know if the size of memory per loop is directly comparable between your environment and mine, but the shape of the curve, and whether the number of Python objects is growing, certainly are relevant.

I made some more commits to SVN on 07/03/07 that are necessary for recent versions of gtk+ and qt. Did you (by any chance) not get those patches? It would also be interesting to know which versions of the toolkits you have, as they are probably different from mine. Is it safe to assume that they are all stock Ubuntu feisty packages? In any case, I have updated memleak_gui.py to display the relevant toolkit versions, and I've also attached a standalone script that does the same. Its output on my machine is:

# pygtk version: (2, 10, 4), gtk version: (2, 10, 9)
# PyQt4 version: 4.2, Qt version 40300
# pyqt version: 3.17.2, qt version: 30303
# wxPython version: 2.8.4.0
# Tkinter version: $Revision: 50704 $, Tk version: 8.4, Tcl version: 8.4

Cheers,
Mike

toolkit_versions.py (1.01 KB)

memleak_gtk_mdroe.asc (4.88 KB)

memleak_gtkagg_mdroe.asc (4.89 KB)

memleak_gtkcairo_mdroe.asc (4.89 KB)

memleak_qt4agg_mdroe.asc (4.87 KB)

memleak_qtagg_mdroe.asc (4.87 KB)

memleak_tkagg_mdroe.asc (4.9 KB)

memleak_wx_mdroe.asc (4.86 KB)

memleak_wxagg_mdroe.asc (4.86 KB)

Here is mine--not very different from yours:

# pygtk version: (2, 10, 4), gtk version: (2, 10, 11)
# PyQt4 version: 4.1, Qt version 40202
# pyqt version: 3.17, qt version: 30307
# wxPython version: 2.8.1.1
# Tkinter version: $Revision: 50704 $, Tk version: 8.4, Tcl version: 8.4

Everything is stock ubuntu feisty. wx is built on gtk.
I'm pretty sure I did the tests after your 07/03/07 commits.

I just updated from svn and tried to rerun the wx test, but ran into an error:

efiring@...340...:~/programs/py/mpl/tests$ python ../matplotlib_units/unit/memleak_gui.py -dwx -s1000 -e2000 > ~/temp/memleak_wx_0705.asc
Traceback (most recent call last):
   File "../matplotlib_units/unit/memleak_gui.py", line 58, in <module>
     pylab.close(fig)
   File "/usr/local/lib/python2.5/site-packages/matplotlib/pylab.py", line 742, in close
     _pylab_helpers.Gcf.destroy(manager.num)
   File "/usr/local/lib/python2.5/site-packages/matplotlib/_pylab_helpers.py", line 28, in destroy
     figManager.destroy()
   File "/usr/local/lib/python2.5/site-packages/matplotlib/backends/backend_wx.py", line 1403, in destroy
     self.frame.Destroy()
   File "/usr/local/lib/python2.5/site-packages/matplotlib/backends/backend_wx.py", line 1362, in Destroy
     wxapp.Yield()
NameError: global name 'wxapp' is not defined

Eric

Eric Firing wrote:

I just updated from svn and tried to rerun the wx test, but ran into an error:

efiring@...340...:~/programs/py/mpl/tests$ python ../matplotlib_units/unit/memleak_gui.py -dwx -s1000 -e2000 > ~/temp/memleak_wx_0705.asc
[...]
    wxapp.Yield()
NameError: global name 'wxapp' is not defined

I think I just saw a note that Ken had committed a user-provided patch that keeps the wx back-end from re-starting an event loop if there is one already running -- maybe that has something to do with this bug?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

Yes -- the global wxapp variable was removed (a very good thing). I just committed a patch to fix this crash (r3460).

Cheers,
Mike

Mike,

New exception:

efiring@...340...:~/programs/py/mpl/tests$ python ../matplotlib_units/unit/memleak_gui.py -dwx -s1000 -e2000 > ~/temp/memleak_wx_0705.asc
Traceback (most recent call last):
   File "../matplotlib_units/unit/memleak_gui.py", line 58, in <module>
     pylab.close(fig)
   File "/usr/local/lib/python2.5/site-packages/matplotlib/pylab.py", line 742, in close
     _pylab_helpers.Gcf.destroy(manager.num)
   File "/usr/local/lib/python2.5/site-packages/matplotlib/_pylab_helpers.py", line 28, in destroy
     figManager.destroy()
   File "/usr/local/lib/python2.5/site-packages/matplotlib/backends/backend_wx.py", line 1405, in destroy
     self.frame.Destroy()
   File "/usr/local/lib/python2.5/site-packages/matplotlib/backends/backend_wx.py", line 1364, in Destroy
     wxapp.Yield()
AttributeError: 'listiterator' object has no attribute 'Yield'

Eric