There is a fix in r8341. It passes the regression tests, and all of the event handling examples I tried seem to still work.
It seems that many places in matplotlib were never disconnecting callbacks, and these callbacks keep references to the destination objects alive.
Unfortunately, it's not obvious where the "disconnect" calls should be added -- the lifetime of objects isn't very symmetrical. For example, the "units" callback is set up by Line2D inside of its "set_axes" method, but there is no "remove_axes" method in which to put the disconnect. Tracking down all of the ways in which a line could be removed from an axes seems daunting.
Instead, my solution is to store weak references to the methods stored in the CallbackRegistry -- that way the CallbackRegistry won't leak references like it does now. Since the Python stdlib weakref module doesn't directly support weak references to bound methods, the whole thing is a bit hairy -- but I think it's a more permanent solution than trying to ensure that all callbacks get explicitly disconnected.
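To illustrate the idea (this is only a sketch, not the code committed in r8341): a plain weakref.ref to a bound method dies immediately, because the bound method object is a temporary. One common workaround is to hold a weak reference to the instance plus the underlying function separately. The names WeakBoundMethod and Line below are hypothetical. Later Python versions (3.4 and up) ship weakref.WeakMethod for the same purpose.

```python
import gc
import weakref


class WeakBoundMethod:
    """Illustrative helper: refer to a bound method without keeping
    its instance alive. Not matplotlib's actual implementation."""

    def __init__(self, method):
        self._obj_ref = weakref.ref(method.__self__)  # weak ref to the instance
        self._func = method.__func__                  # plain function, safe to hold

    def is_dead(self):
        return self._obj_ref() is None

    def __call__(self, *args, **kwargs):
        obj = self._obj_ref()
        if obj is None:
            raise ReferenceError("instance has been garbage collected")
        return self._func(obj, *args, **kwargs)


class Line:
    """Hypothetical stand-in for an artist with a callback method."""

    def __init__(self):
        self.recached = 0

    def recache(self):
        self.recached += 1
        return self.recached


line = Line()

# A naive weak reference points at a temporary bound method object,
# which is freed as soon as this statement finishes (on CPython).
naive = weakref.ref(line.recache)
naive_is_dead = naive() is None

# The helper stays callable for as long as the instance exists...
proxy = WeakBoundMethod(line.recache)
first_call = proxy()

# ...and does not keep the instance alive once it is dropped.
del line
gc.collect()
proxy_dead = proxy.is_dead()
```

A registry that stores such proxies can simply skip (and prune) dead entries when it dispatches a signal, so nothing needs to call disconnect explicitly.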
As this change is rather fundamental and may have unintended consequences, please play with it in your contexts and let me know if you see anything strange.
On 05/28/2010 10:47 AM, Michael Droettboom wrote:
I'm on to something -- some callbacks are being created that are never
self._xcid = ax.xaxis.callbacks.connect('units', self.recache_always)
gets called twice. This is problematic because the id of the first
connection is simply lost. Also, there doesn't seem to be any code to
attempt to remove either of them.
I'm looking into it further -- forcibly deleting these callbacks reduces
the reference count on the line object, but doesn't seem to completely
eliminate the leak.
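A small sketch of why the double connect matters, using a toy registry rather than matplotlib's CallbackRegistry (Registry, Line, set_axes_naive and set_axes_guarded are all hypothetical names for illustration). Overwriting the stored id leaves the first callback registered with no handle to remove it; disconnecting the old id before reconnecting avoids that.

```python
import itertools


class Registry:
    """Toy callback registry that stores strong references (illustrative)."""

    def __init__(self):
        self._cids = itertools.count()
        self.callbacks = {}

    def connect(self, signal, func):
        cid = next(self._cids)
        self.callbacks.setdefault(signal, {})[cid] = func
        return cid

    def disconnect(self, cid):
        for funcs in self.callbacks.values():
            funcs.pop(cid, None)


class Line:
    """Hypothetical stand-in for a line that listens for 'units' changes."""

    def __init__(self):
        self._xcid = None

    def recache_always(self):
        pass

    def set_axes_naive(self, registry):
        # A second call overwrites self._xcid; the first id is lost.
        self._xcid = registry.connect('units', self.recache_always)

    def set_axes_guarded(self, registry):
        # Drop the previous connection before making a new one.
        if self._xcid is not None:
            registry.disconnect(self._xcid)
        self._xcid = registry.connect('units', self.recache_always)


naive_registry = Registry()
naive_line = Line()
naive_line.set_axes_naive(naive_registry)
naive_line.set_axes_naive(naive_registry)
naive_count = len(naive_registry.callbacks['units'])

guarded_registry = Registry()
guarded_line = Line()
guarded_line.set_axes_guarded(guarded_registry)
guarded_line.set_axes_guarded(guarded_registry)
guarded_count = len(guarded_registry.callbacks['units'])
```

The guard only fixes the lost id. The remaining callback still holds a strong reference to the line until something disconnects it.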
On 05/28/2010 10:12 AM, John Hunter wrote:
On Fri, May 28, 2010 at 3:18 AM, Pearu Peterson<pearu@...20...> wrote:
In an application that updates a plot with
new experimental data, say, every second and the experiment
can last hours, I have tried two approaches:
1) clear axes and plot new experimental data - this is
slow and takes too much cpu resources.
2) remove lines and plot new experimental data - this is
fast enough but unfortunately there seems to be a memory
leakage, the application runs out of memory.
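Here follows a simple script that demonstrates the
leakage:

    import numpy
    from numpy.testing.utils import memusage
    import matplotlib.pyplot as plt

    x = range(1000)
    axes1 = plt.figure().add_subplot(111)
    while 1:
        y = numpy.random.rand(len(x))
        if 1:
            # approach 2: remove the old line (leaks); this branch is
            # reconstructed from the description above
            for line in list(axes1.lines):  # iterate over a copy
                if line.get_label() == 'data':
                    line.remove()
        else:
            # no leak, but slow
            axes1.clear()
        axes1.plot(x, y, 'b', label='data')
        print memusage(), len(axes1.lines)
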
When running the script, the memory usage
increases by 132 kbytes per iteration; that is, at one
iteration per second, within an hour this example application
will consume 464 MB of RAM even though no new data has been
generated. In a real application this effect will be even worse.
So, I am looking for advice on how to avoid
this memory leakage without clearing the axes.
Hey Pearu -- thanks for the report. We'll try and track down and fix
this leak. In the interim, would an acceptable workaround for you be
to *reuse* an existing line by calling set_data on it? That way you
wouldn't have to do the add/remove that is causing your leak. Have
you confirmed this leak on various backends (eg Agg, PDF, PS)?
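A minimal sketch of that workaround, assuming the Agg backend and a fixed five updates in place of the long-running acquisition loop (both are assumptions for the sake of a runnable example). The line is created once and only its data is replaced, so no artist is added or removed per update.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen only for this sketch
import matplotlib.pyplot as plt

x = list(range(1000))
fig = plt.figure()
ax = fig.add_subplot(111)

# Create the line once; ax.plot returns a list of Line2D objects.
(line,) = ax.plot(x, [0.0] * len(x), 'b', label='data')

for _ in range(5):  # stands in for "every second, for hours"
    y = [random.random() for _ in x]
    line.set_data(x, y)   # reuse the existing Line2D
    ax.relim()            # recompute data limits from the new data
    ax.autoscale_view()   # rescale the view to those limits
    fig.canvas.draw()

n_lines = len(ax.lines)
```

If the axis limits are known in advance, the relim/autoscale_view calls can be dropped, which makes each update cheaper.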