Possible memory leak?

Matplotlib Users:

It seems matplotlib plotting has a relatively small memory leak. My
experiments suggest it leaks between 5K and 8K bytes of RAM for every plot
redraw. For example, in one experiment, plotting the same buffer (so as not to
allocate new memory) every second for a period of about 12 hours resulted in
memory usage (physical RAM) increasing by approximately 223MB, which is about
5.3K per replot. The plotting code is:

class PlotPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, wx.ID_ANY,
                          style=wx.BORDER_THEME|wx.TAB_TRAVERSAL)

        self._figure = MplFigure(dpi=None)
        self._canvas = MplCanvas(self, -1, self._figure)
        self._axes = self._figure.add_subplot(1,1,1)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self._canvas, 1, wx.EXPAND|wx.TOP, 5)
        self.SetSizer(sizer)

    def draw(self, channel, seconds):
        self._axes.clear()
        self._axes.plot(channel, seconds)
        self._canvas.draw()

draw() is called every second with the same channel and seconds
numpy.array buffers.

In my case, this leak, though relatively small, becomes a serious issue since
my software often runs for long periods of time (days) plotting data streamed
from a data acquisition unit.

Any suggestions will help. Am I misunderstanding something here? Maybe I
need to call some obscure function to free memory, or something?

My testing environment:

  • Windows XP SP3, Intel Core 2 Duo @ 2.33GHz, 1.96 GB RAM
  • Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
    (Intel)] on win32
  • matplotlib version 1.0.0
  • numpy 1.4.1
  • wxPython version 2.8.11.0

The complete test program follows.

Thanks,
Caleb

from random import random
from datetime import datetime
import os
import time
import win32api
import win32con
import win32process

import wx
import numpy

import matplotlib as mpl
from matplotlib.figure import Figure as MplFigure
from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg as MplCanvas

def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id);
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

class PlotPanel(wx.Panel):
    def __init__(self, parent):
        wx.Panel.__init__(self, parent, wx.ID_ANY,
                          style=wx.BORDER_THEME|wx.TAB_TRAVERSAL)

        self._figure = MplFigure(dpi=None)
        self._canvas = MplCanvas(self, -1, self._figure)
        self._axes = self._figure.add_subplot(1,1,1)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self._canvas, 1, wx.EXPAND|wx.TOP, 5)
        self.SetSizer(sizer)

    def draw(self, channel, seconds):
        self._axes.clear()
        self._axes.plot(channel, seconds)
        self._canvas.draw()

class TestFrame(wx.Frame):
    def __init__(self, parent, id, title):
        wx.Frame.__init__(
            self, parent, id, title, wx.DefaultPosition, (600, 400))

        self.testDuration = 60 * 60 * 24
        self.startTime = 0
        self.channel = numpy.sin(numpy.arange(1000) * random())
        self.seconds = numpy.arange(len(self.channel))

        self.plotPanel = PlotPanel(self)

        sizer = wx.BoxSizer(wx.VERTICAL)
        sizer.Add(self.plotPanel, 1 ,wx.EXPAND)
        self.SetSizer(sizer)

        self._timer = wx.Timer(self)
        self.Bind(wx.EVT_TIMER, self._onTimer, self._timer)
        self._timer.Start(1000)

        print "starting memory: ",\
            get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

    def _onTimer(self, evt):
        if self.startTime == 0:
            self.startTime = time.time()

        if (time.time() - self.startTime) >= self.testDuration:
            self._timer.Stop()

        self.plotPanel.draw(self.channel, self.seconds)

        t = datetime.now()
        memory = get_process_memory_info(os.getpid())
        print "time: {0}, working: {1:f}".format(
            t, memory["WorkingSetSize"]/meg)

class MyApp(wx.App):
    def OnInit(self):
        frame = TestFrame(None, wx.ID_ANY, "Memory Leak")
        self.SetTopWindow(frame)
        frame.Show(True)
        return True

if __name__ == '__main__':
    app = MyApp(0)
    app.MainLoop()

Caleb,

Interesting analysis. One possible source of a leak would be some sort of dangling reference that still hangs around even though the plot objects have been cleared. By the time of the matplotlib 1.0.0 release, we did seem to clear out pretty much all of these, but it is possible there are still some lurking about. We should probably run your script against the latest svn to see how the results compare.

Another possibility might be related to numpy. However this is the draw statement, so I don’t know how much numpy is used in there. The latest refactor work in numpy has revealed some memory leaks that have existed, so who knows?

Might be interesting to try making equivalent versions of this script using different backends, and different package versions to possibly isolate the source of the memory leak.

Thanks for your observations,

Ben Root

In our experience, many of the GUI backends have some leak, and these
are in the GUI and not in mpl. Caleb, can you see if you can
replicate the leak with your example code using the agg backend (no
GUI)? If so, could you post the code that exposes the leak? If not,
I'm afraid it is in wx and you might need to deal with the wx
developers.

JDH

Heh. Good timing! I just fixed a bug in Chaco involving a leaking cycle that the garbage collector could not clean up. The lesson of my tale of woe is that even if there is no leak when you run without wxPython, that doesn't mean that wxPython is the culprit.

If any object in the connected graph containing a cycle (even if it does not directly participate in the cycle) has an __del__ method in pure Python, then the garbage collector will not clean up that cycle for safety reasons. Read the docs for the gc module for details. We use SWIG to wrap Agg and SWIG adds __del__ methods for all of its classes. wxPython uses SWIG and has the same problems. If there is a cycle which can reach a wxPython object, the cycle will leak. The actual cycle may be created by matplotlib, though.

You can determine if this is the case pretty easily, though. Call gc.collect() then examine the list gc.garbage. This will contain all of those objects with a __del__ that prevented a cycle from being collected.

I recommend using objgraph to diagram the graph of references to those objects. It's invaluable to actually see what's going on.

   http://pypi.python.org/pypi/objgraph
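
The gc check described above can be sketched roughly as follows. This is only an illustration: the Node class and the make_cycle and collect_and_report helpers are hypothetical names invented for the example, not part of matplotlib, wxPython, or the original test scripts. Note also that the behaviour described in this message applies to the Python 2 interpreter used in this thread; since Python 3.4 (PEP 442), cycles whose objects define __del__ are normally collected, so gc.garbage will usually stay empty there.

```python
import gc


class Node(object):
    """Hypothetical object that can take part in a reference cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    """Create two objects that reference each other, then drop them."""
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a


def collect_and_report():
    """Run a full collection; return (unreachable found, len(gc.garbage))."""
    unreachable = gc.collect()
    return unreachable, len(gc.garbage)


gc.collect()  # start from a clean slate
make_cycle()
found, garbage = collect_and_report()
print("unreachable objects found: %d" % found)
print("uncollectable objects in gc.garbage: %d" % garbage)
```

A non-zero count from gc.collect() together with an empty gc.garbage means the collector found cycles and was able to free them; objects that show up in gc.garbage are the ones worth diagramming with objgraph.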

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

I conducted a couple more experiments taking into consideration suggestions
made in responses to my original post (thanks for the response).

First, I ran my original test (as close to it as possible anyway) using the
Agg back end for 3 hours, plotting 16591 times (about 1.5Hz). Memory usage
increased by 86MB. That's about 5.3K per redraw. Very similar to my original
experiment. As suggested, I called gc.collect() after each iteration. It
returned 67 for every iteration (no increase), although len(gc.garbage)
reported 0 each iteration.

Second, I ran a test targeting TkAgg for 3 hours, plotting 21374 times. Memory
usage fluctuated over time, but essentially did not increase: starting at
32.54MB and ending at 32.79MB. gc.collect() reported 0 after each iteration
as did len(gc.garbage).
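
The per-redraw figures quoted above can be re-derived with a short calculation. This is a minimal sketch, assuming "MB" means 1024 * 1024 bytes (as the meg constant in the test scripts suggests), "K" means 1024 bytes, and the original wx run performed exactly one redraw per second for 12 hours; the leak_per_redraw_kib helper is a name made up for this example.

```python
def leak_per_redraw_kib(growth_mib, redraws):
    """Average memory growth per redraw, in KiB."""
    return growth_mib * 1024.0 / redraws


# Agg run: 86 MB over 16591 redraws.
agg = leak_per_redraw_kib(86, 16591)

# Original wx run: 223 MB over roughly 12 hours at one redraw per second.
wx_run = leak_per_redraw_kib(223, 12 * 60 * 60)

# TkAgg run: 32.54 MB at the start, 32.79 MB at the end, 21374 redraws.
tkagg = leak_per_redraw_kib(32.79 - 32.54, 21374)

print("Agg:   %.1f KiB per redraw" % agg)
print("wx:    %.1f KiB per redraw" % wx_run)
print("TkAgg: %.3f KiB per redraw" % tkagg)
```

The Agg and wx rates come out nearly identical, while the TkAgg growth is a few bytes per iteration at most.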

Attached are images of plots showing change in memory usage over time for each
experiment (tkagg_memory_usage.png, agg_memory_usage.png).

Any comments would be appreciated.

Following is the code for each experiment.

Agg
-----

from random import random
from datetime import datetime
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("Agg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id);
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

figure = Figure(dpi=None)
canvas = FigureCanvas(figure)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
    axes.clear()
    axes.plot(channel, seconds)
    canvas.print_figure('test.png')

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))
testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
    get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

while (time.time() - startTime) < testDuration:
    draw(channel, seconds)

    t = datetime.now()
    memory = get_process_memory_info(os.getpid())
    print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
        t,
        memory["WorkingSetSize"]/meg,
        gc.collect(),
        len(gc.garbage) )

    time.sleep(0.5)

TkAgg
---------
from random import random
from datetime import datetime
import sys
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("TkAgg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg \
    as FigureCanvas

import Tkinter as tk

def get_process_memory_info(process_id):
    memory = {}
    process = None
    try:
        process = win32api.OpenProcess(
            win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
            False, process_id);
        if process is not None:
            return win32process.GetProcessMemoryInfo(process)
    finally:
        if process:
            win32api.CloseHandle(process)
    return memory

meg = 1024.0 * 1024.0

rootTk = tk.Tk()
rootTk.wm_title("TKAgg Memory Leak")

figure = Figure()
canvas = FigureCanvas(figure, master=rootTk)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
    axes.clear()
    axes.plot(channel, seconds)

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))

testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
    get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

draw(channel, seconds)
canvas.show()
canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

rate = 500

def on_tick():
    canvas.get_tk_widget().after(rate, on_tick)

    if (time.time() - startTime) >= testDuration:
        return

    draw(channel, seconds)

    t = datetime.now()
    memory = get_process_memory_info(os.getpid())
    print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
        t,
        memory["WorkingSetSize"]/meg,
        gc.collect(),
        len(gc.garbage) )

canvas.get_tk_widget().after(rate, on_tick)
tk.mainloop()

Caleb,

Thanks for doing all of this investigation and providing something
easy to reproduce.

With the help of valgrind, I believe I've tracked it down to a bug
in PyCXX, the Python/C++ interface tool matplotlib uses.

I have attached a patch that seems to remove the leak for me, but as
I'm not a PyCXX expert, I'm not comfortable with committing it to
the repository just yet. I'm hoping you and/or some other
developers could test it on their systems (a fully clean re-build
is required) and report any problems back. I also plan to
raise this question on the PyCXX mailing list to get any thoughts
they may have.

Cheers,
Mike

cxx_memleak.patch (574 Bytes)

Interesting results. I would like to try these tests on a Linux machine to see if there is a difference, but I don’t know what the equivalent functions would be to some of the win32 calls. Does anybody have a reference for such things?

Ben Root
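
One possible stand-in for the win32 WorkingSetSize lookup on Linux is sketched below. This is an assumption-laden illustration, not something taken from the thread: the get_resident_memory_mb helper is a hypothetical name, the primary path reads the Linux-specific /proc/self/statm file, and the fallback uses resource.getrusage, which reports the peak (not current) resident size, in kilobytes on Linux and in bytes on macOS.

```python
import os
import sys

MEG = 1024.0 * 1024.0


def get_resident_memory_mb():
    """Return this process's resident memory in MB (hypothetical helper)."""
    statm_path = "/proc/self/statm"
    if os.path.exists(statm_path):
        # Second field of statm is the resident set size, in pages.
        with open(statm_path) as statm:
            resident_pages = int(statm.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE") / MEG

    # Fallback for other Unix systems: peak resident size only.
    import resource
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return peak / MEG      # ru_maxrss is in bytes on macOS
    return peak / 1024.0       # ru_maxrss is in kilobytes elsewhere


print("resident memory: %f MB" % get_resident_memory_mb())
```

In the test scripts, a call such as get_process_memory_info(os.getpid())["WorkingSetSize"]/meg could then be swapped for get_resident_memory_mb(), keeping the rest of the loop unchanged.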

     >
     > Caleb,
     >
     > Interesting analysis. One possible source of a leak would be
    some sort of dangling reference that still hangs around even though
    the plot objects have been cleared. By the time of the matplotlib
    1.0.0 release, we did seem to clear out pretty much all of these,
    but it is possible there are still some lurking about. We should
    probably run your script against the latest svn to see how the
    results compare.
     >
     > Another possibility might be related to numpy. However this is
    the draw statement, so I don't know how much numpy is used in there.
    The latest refactor work in numpy has revealed some memory leaks
    that have existed, so who knows?
     >
     > Might be interesting to try making equivalent versions of this
    script using different backends, and different package versions to
    possibly isolate the source of the memory leak.
     >
     > Thanks for your observations,
     > Ben Root
     >

    Sorry for the double post; it seems the first is not displaying
    correctly on SourceForge.

    I conducted a couple more experiments taking into consideration
    suggestions
    made in responses to my original post (thanks for the response).

    First, I ran my original test (as close to it as possible anyway)
    using the
    Agg back end for 3 hours, plotting 16591 times (about 1.5Hz). Memory
    usage
    increased by 86MB. That's about 5.3K per redraw. Very similar to my
    original
    experiment. As suggested, I called gc.collect() after each iteration. It
    returned 67 for every iteration (no increase), although len(gc.garbage)
    reported 0 each iteration.

    Second, I ran a test targeting TkAgg for 3 hours, plotting 21374
    times. Memory
    usage fluctuated over time, but essentially did not increase:
    starting at
    32.54MB and ending at 32.79MB. gc.collect() reported 0 after each
    iteration
    as did len(gc.garbage).

    Attached are images of plots showing change in memory usage over
    time for each
    experiment.

    Any comments would be appreciated.

    Following is the code for each experiment.

    Agg
    -----

    from random import random
    from datetime import datetime
    import os
    import gc
    import time
    import win32api
    import win32con
    import win32process

    import numpy

    import matplotlib
    matplotlib.use("Agg")
    from matplotlib.figure import Figure
    from matplotlib.backends.backend_agg import FigureCanvasAgg as
    FigureCanvas

    def get_process_memory_info(process_id):
        memory = {}
        process = None
        try:
            process = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
                False, process_id);
            if process is not None:
                return win32process.GetProcessMemoryInfo(process)
        finally:
            if process:
                win32api.CloseHandle(process)
        return memory

    meg = 1024.0 * 1024.0

    figure = Figure(dpi=None)
    canvas = FigureCanvas(figure)
    axes = figure.add_subplot(1,1,1)

    def draw(channel, seconds):
        axes.clear()
        axes.plot(channel, seconds)
        canvas.print_figure('test.png')

    channel = numpy.sin(numpy.arange(1000) * random())
    seconds = numpy.arange(len(channel))
    testDuration = 60 * 60 * 3
    startTime = time.time()

    print "starting memory: ", \
        get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

    while (time.time() - startTime) < testDuration:
        draw(channel, seconds)

        t = datetime.now()
        memory = get_process_memory_info(os.getpid())
        print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
            t,
            memory["WorkingSetSize"]/meg,
            gc.collect(),
            len(gc.garbage) )

        time.sleep(0.5)

    TkAgg
    ---------
    from random import random
    from datetime import datetime
    import sys
    import os
    import gc
    import time
    import win32api
    import win32con
    import win32process

    import numpy

    import matplotlib
    matplotlib.use("TkAgg")
    from matplotlib.figure import Figure
    from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg \
        as FigureCanvas

    import Tkinter as tk

    def get_process_memory_info(process_id):
        memory = {}
        process = None
        try:
            process = win32api.OpenProcess(
                win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
                False, process_id);
            if process is not None:
                return win32process.GetProcessMemoryInfo(process)
        finally:
            if process:
                win32api.CloseHandle(process)
        return memory

    meg = 1024.0 * 1024.0

    rootTk = tk.Tk()
    rootTk.wm_title("TKAgg Memory Leak")

    figure = Figure()
    canvas = FigureCanvas(figure, master=rootTk)
    axes = figure.add_subplot(1,1,1)

    def draw(channel, seconds):
        axes.clear()
        axes.plot(channel, seconds)

    channel = numpy.sin(numpy.arange(1000) * random())
    seconds = numpy.arange(len(channel))

    testDuration = 60 * 60 * 3
    startTime = time.time()

    print "starting memory: ", \
        get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

    draw(channel, seconds)
    canvas.show()
    canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

    rate = 500

    def on_tick():
        canvas.get_tk_widget().after(rate, on_tick)

        if (time.time() - startTime) >= testDuration:
            return

        draw(channel, seconds)

        t = datetime.now()
        memory = get_process_memory_info(os.getpid())
        print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
            t,
            memory["WorkingSetSize"]/meg,
            gc.collect(),
            len(gc.garbage) )

    canvas.get_tk_widget().after(rate, on_tick)
    tk.mainloop()

Interesting results. I would like to try these tests on a Linux machine
to see if there is a difference, but I don't know what the equivalent
functions would be to some of the win32 calls. Does anybody have a
reference for such things?

Do you need win32 calls, or do you just need to read the memory usage? If the latter, see cbook.report_memory().

Eric
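For running the same tests on Linux without the win32 calls, one option is to read the resident set size straight from /proc. This is only a minimal sketch and not necessarily what cbook.report_memory() does internally. The helper names parse_vm_rss_kib and get_rss_mib are made up for illustration, and the code assumes a Linux kernel that exposes a VmRSS line in /proc/self/status.

```python
import re


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def get_rss_mib(pid="self"):
    """Resident set size in MiB (Linux only; assumes /proc is mounted)."""
    with open("/proc/{0}/status".format(pid)) as status_file:
        rss_kib = parse_vm_rss_kib(status_file.read())
    return None if rss_kib is None else rss_kib / 1024.0
```

In the test scripts, get_rss_mib() could stand in for get_process_memory_info(os.getpid())["WorkingSetSize"]/meg, with the caveat that resident set size on Linux and working set size on Windows are similar but not identical measures.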

I tried out the script using cbook.report_memory() with and without the patch. The patch certainly made the leak much slower. I am still finding a very slow leak at approximately 0.03226 MiB per 5 minutes in the resident set size.

Ben Root

···

On Mon, Nov 22, 2010 at 11:32 AM, Eric Firing <efiring@…202…> wrote:

On 11/22/2010 06:15 AM, Benjamin Root wrote:

On Fri, Nov 19, 2010 at 3:14 PM, Caleb Constantine <cadamantine@…287…> wrote:

On Thu, Nov 18, 2010 at 4:50 PM, Benjamin Root <ben.root@...1304...> wrote:

 > Caleb,
 >
 > Interesting analysis.  One possible source of a leak would be some sort of dangling reference that still hangs around even though the plot objects have been cleared.  By the time of the matplotlib 1.0.0 release, we did seem to clear out pretty much all of these, but it is possible there are still some lurking about.  We should probably run your script against the latest svn to see how the results compare.
 >
 > Another possibility might be related to numpy.  However this is the draw statement, so I don't know how much numpy is used in there. The latest refactor work in numpy has revealed some memory leaks that have existed, so who knows?
 >
 > Might be interesting to try making equivalent versions of this script using different backends, and different package versions to possibly isolate the source of the memory leak.
 >
 > Thanks for your observations,
 > Ben Root

This picks up from a thread of the same name between 18 Nov 2010 and
22 Nov 2010.

Release 1.0.1 of matplotlib has made significant gains in reducing the
memory leak (thanks!!), but it did not eliminate the problem entirely.
Recall, the TkAgg back-end does not have any leak, so we know this
particular leak is in matplotlib or wxPython.

Here are the results of some tests.

Matplotlib 1.0.0

- 1 hour
- Plotted 3595 times, about 1Hz
- Memory usage increased by about 18.7MB (59.96 - 41.25), or about
5.3K per redraw.

Matplotlib 1.0.1

- 1 hour
- Plotted 3601 times, about 1Hz
- Memory usage increased by about 1.4MB (42.98 - 41.59), or about
0.40K per redraw.

- 12 hour
- Plotted 43201 times, about 1Hz
- Memory usage increased by about 13.3MB (54.32 - 41.01), or about
0.32K per redraw.
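The per-redraw figures above can be rechecked from the quoted start and end values. This is just a sketch of the arithmetic: it assumes "MB" means 1024 * 1024 bytes (as the meg constant in the test scripts suggests) and "K" means 1024 bytes. The function name kib_per_redraw is made up for illustration.

```python
def kib_per_redraw(start_mb, end_mb, redraws):
    """Average memory growth per redraw in KiB, given start/end usage in MiB."""
    return (end_mb - start_mb) * 1024.0 / redraws


# (label, starting MB, ending MB, number of redraws) as reported above
runs = [
    ("matplotlib 1.0.0, 1 hour", 41.25, 59.96, 3595),
    ("matplotlib 1.0.1, 1 hour", 41.59, 42.98, 3601),
    ("matplotlib 1.0.1, 12 hour", 41.01, 54.32, 43201),
]

for label, start, end, redraws in runs:
    rate = kib_per_redraw(start, end, redraws)
    print("{0}: {1:.2f}K per redraw".format(label, rate))
```

Under those unit assumptions the three rates come out close to the 5.3K, 0.40K and 0.32K quoted in the list.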

As stated before, for a process plotting data for long periods of
time, this becomes an issue.

Caleb

There's a lot of moving parts here. Running your script again is showing some leaks in valgrind that weren't there before, but a number of the underlying libraries have changed on my system since then (memory leaks tend to be Whac-a-mole sometimes...)

Which versions of the following are you running, and on what platform -- some variant of MS-Windows if I recall correctly?

Python
Numpy
wxPython
Tkinter

Mike

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
   
--
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA

Windows XP SP 3
Python - 2.6.6
Numpy - 1.4.1
wxPython - 2.8.11.0
Tkinter - $Revision: 73770 $

I'll install new versions of Numpy and wxPython (and maybe Python) and
try again.

Ok. I have a RHEL5 Linux box with Python 2.7.1.

With Numpy 1.4.1 and 1.5.1 I don't see any leaks. With Numpy git HEAD, I did see a leak -- I submitted a pull request to Numpy here:

   https://github.com/numpy/numpy/pull/76

I get the same results (no leaks) running your wx, tk and agg scripts (with the Windows-specific stuff removed).

FWIW, I have wxPython 2.8.11.0 and Tkinter rev 81008.

So the variables are the platform and the version of Python. Perhaps it's one of those two things?

Mike

Consider the following:

     matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

     - 1 hour
     - Plotted 3601 times, about 1Hz
     - Memory usage increased by about 1.16MB (41.39 - 40.23), or about 0.33K per redraw

It seems the same memory leak exists. Given you don't have this issue
on Linux with the same Python configuration, I can only assume it is
related to some Windows-specific code somewhere. I'll run for a longer
period of time just in case, but I don't expect the results to be
different.

One way to rule out Windows-specific code may be to run with the Agg backend only (without wx). Have you plotted the memory growth? This amount of memory growth is well within the pool allocation sizes that Python routinely uses. Does the value of len(gc.get_objects()) grow over time?
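The len(gc.get_objects()) check can be wrapped in a small helper. What follows is a rough sketch only: object_count_growth and leaky_step are hypothetical names, and in the real test the step would be the draw(channel, seconds) call from the scripts.

```python
import gc


def object_count_growth(step, iterations=50):
    """Call step() repeatedly; return (first, last) counts of gc-tracked objects."""
    step()  # warm-up call so one-time caches are not mistaken for growth
    gc.collect()
    first = len(gc.get_objects())
    for _ in range(iterations):
        step()
    gc.collect()
    last = len(gc.get_objects())
    return first, last


_kept = []


def leaky_step():
    """Deliberately leaky step: keeps a reference to a new list on every call."""
    _kept.append([])


first, last = object_count_growth(leaky_step)
print("tracked objects grew by {0}".format(last - first))
```

A count that stays flat while process memory keeps climbing would point away from lingering Python references and toward memory that the garbage collector never sees.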

Mike

New results follow.

matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

agg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 28.79 - 27.57 = 1.22 MB
- len(gc.get_objects()): 23424 at beginning and end
- Plot of memory growth: roughly linear, increasing with slope of 0.26KB

tkagg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 33.22 - 33.32 = -0.1 MB
- len(gc.get_objects()): 24182 at beginning and end
- Plot of memory growth: very irregular (up and down), but a line fit
  has a slope of about 0.025KB (I could run longer and see if slope
approaches 0)

wxagg
- 3601 redraws (1 hour), about 1Hz
- Memory usage: 43.28 - 41.80 = 1.5 MB
- len(gc.get_objects()): 41473 at beginning and end
- Plot of memory growth: roughly linear, increasing with slope of 0.32KB

Thanks. These are very useful results.

The fact that gc.get_objects() remains constant suggests to me that this is not a simple case of holding on to a Python reference longer than we intend to. Instead, this is either a C-side reference counting bug, or a genuine C malloc-and-never-free bug. Puzzlingly, valgrind usually does a very good job of finding such bugs, but is turning up nothing for me. Will have to scratch my head a little bit longer and see if I can come up with a proper experiment that will help me get to the bottom of this.

Cheers,
Mike

For completeness, I ran more tests over a 10 hour period at an
increased redraw rate. Details follow. Note tkagg memory usage is
flat, agg and wxagg are not.

matplotlib 1.0.1, numpy 1.5.1, python 2.7.1, wxPython 2.8.11.0, Windows XP SP3

agg
- 52214 redraws
- Memory usage: 43.46 - 27.55 = 15.91 MB
- len(gc.get_objects()): 23424 at beginning and end
- Plot of memory growth: linear, increasing with slope of 0.31KB

tkagg
- 71379 redraws
- Memory usage: 30.47 - 30.25 = 0.22 MB
- len(gc.get_objects()): 24171 at beginning, 24182 at end, but mostly
  constant at 24182
- Plot of memory growth: very irregular (up and down), but a line fit
  has a slope of about 0.0002KB.

wxagg
- 72001 redraws
- Memory usage: 62.08 - 40.10 = 21.98 MB
- len(gc.get_objects()): 41473 at beginning and end
- Plot of memory growth: linear, increasing with slope of 0.31KB
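The posts do not say how the line fits were computed. One plausible way to get a slope from the logged memory samples is an ordinary least-squares fit against the redraw index, sketched here in plain Python. The function name fit_slope is made up for illustration, and the sample data is synthetic rather than taken from the runs above.

```python
def fit_slope(samples):
    """Least-squares slope of samples against their index (units per redraw)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / float(n)
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


# Synthetic example: memory in KB growing by 0.5 KB on every redraw
memory_kb = [40000.0 + 0.5 * i for i in range(100)]
print("slope: {0:.3f} KB per redraw".format(fit_slope(memory_kb)))
```

A fit over all samples is less sensitive to momentary fluctuations than the end-minus-start difference, which matters for an irregular trace like the tkagg one.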

Ok. I think I've found a leak in the way the spines' paths were being updated.

https://github.com/matplotlib/matplotlib/pull/89

Can you apply the patch there and let me know how it improves things for you?

Cheers,
Mike
