Legend is eating up a lot of memory

mbrennwa · January 23, 2021, 7:56pm

I have a Python program that gets data from a measurement instrument and plots the data using matplotlib. A separate thread is used to trigger updating of the plots at fixed time intervals.

After a while, the program takes up huge amounts of memory (gigabytes after a few hours). If I put some pressure on the system by running another application that consumes a lot of memory, my Python program will at some point start to release the excess memory used by matplotlib (ending up at about 60 MB or so). Releasing the memory does not seem to have any negative effect on the operation of my program. This tells me that the large chunk of memory used by matplotlib is not vital (if not useless) in my application.

The simplified code below is a self-contained example that illustrates the effect. The issue does not happen if the code is modified to skip the axes.legend(…) call, i.e., if no legend is drawn. You can run the code with / without the legend by setting use_legend to True or False on line 111.

I believe the unnecessary memory usage related to the legend is a bug, or at least something very obscure that seems wrong to me. How can this issue be avoided or fixed?

import wx
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg as FigureCanvas
import time
from numpy.random import rand
from threading import Thread
from setproctitle import setproctitle


print ( 'Matplotlib version: ' + matplotlib.__version__ )


class measurement_instrument():
# measurement instrument

	def __init__(self, num_val):
		self.val1 = None
		self.val2 = None
		self._num_val = num_val
		
	def read(self):
		# get new measurement data:
		u = rand(1)[0]
		if u < 0.3:
			self.val1 = rand(self._num_val)
			self.val2 = None
		elif u < 0.7:
			self.val1 = None
			self.val2 = rand(self._num_val)
		else:
			self.val1 = rand(self._num_val)
			self.val2 = rand(self._num_val)


class measurement_thread(Thread):
# background thread that takes measurements at fixed time intervals

	def __init__(self, instrument):
		Thread.__init__(self)
		self._instrument = instrument
		self.start()
		
	def run(self):
		while True:
			self._instrument.read()
			time.sleep(0.01)


class plots_frame(wx.Frame):

	def __init__(self, instrument, use_legend):

		# Access to measurement instrument (to get data for plotting):
		self._instrument = instrument
		
		# Create the window:		
		wx.Frame.__init__(self, parent=None, title='Instrument Data', style=wx.DEFAULT_FRAME_STYLE)
		
		# Create a wx panel to hold the plots
		p  = wx.Panel(self)
		p.SetBackgroundColour(wx.NullColour)
		
		# Set up figure:
		self.figure = plt.figure()
		self.axes = self.figure.add_subplot(1,1,1)
		self.axes.grid()	
		self.canvas = FigureCanvas(p, -1, self.figure)
		
		self._do_legend = use_legend # set this to False for no legend in the plot

		# wx sizer / widget arrangement within the window:
		sizer = wx.BoxSizer(wx.VERTICAL)
		sizer.Add(self.canvas, 1, wx.EXPAND)
		p.SetSizer(sizer) # apply the sizer to the panel
		sizer.SetSizeHints(self) # inform frame about size hints for the panel (min. plot canvas size and stuff)

		# install wx.Timer to update the data plot:
		self.Bind(wx.EVT_TIMER, self.on_plot_data) # bind to TIMER event
		self.plot_timer = wx.Timer(self) # timer to trigger plots
		wx.CallAfter(self.plot_timer.Start,100)
		
		# Show the wx.Frame
		self.Show()

	def on_plot_data(self,event):

		# Remove the old lines that are currently in the plot:
		while len(self.axes.lines) > 0: self.axes.lines[0].remove()
		
		# Plot the new instrument data:
		if self._instrument.val1 is not None:
			self.axes.plot(self._instrument.val1, label='val-1', color='r')
		if self._instrument.val2 is not None:
			self.axes.plot(self._instrument.val2, label='val-2', color='b')

		# Legend (commenting out this line will avoid the memory issue!):
		if self._do_legend:
			leg = self.axes.get_legend() # get the old legend, if there is one
			if leg is not None:
				leg.remove() # delete the old legend, if there is one
			self.axes.legend(loc=1) # show new legend

		# Refresh the plot window:
		self.canvas.draw()
		self.Refresh()


########## main:

use_legend = True # use this to turn the legend on / off

if use_legend:
	setproctitle('memorytest_with_legend')
else:
	setproctitle('memorytest_without_legend')

app = wx.App()
instrument   = measurement_instrument(num_val=300)
dataplots    = plots_frame(instrument, use_legend)
measurements = measurement_thread(instrument)

app.MainLoop()

jklymak · January 23, 2021, 10:21pm

Someone who uses wx will have to help with the minimal example and why it appears to grow in memory. But in general it's preferable to draw your lines and make the legend in __init__ and then just update the x and y data on the line objects via set_xdata and set_ydata: matplotlib.lines.Line2D — Matplotlib 3.3.3 documentation

If you don’t have any data on init, that is probably fine, you can just set to 0, 0 or some such.
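A minimal sketch of that suggestion, assuming a headless Agg backend instead of wx so it runs anywhere. The update() helper is a hypothetical name, not part of the original program; it pushes new data into the two existing lines rather than re-plotting, so the legend is created only once.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without wx
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.grid()

# Create the artists and the legend once, with empty placeholder data.
(line1,) = ax.plot([], [], color="r", label="val-1")
(line2,) = ax.plot([], [], color="b", label="val-2")
ax.legend(loc=1)


def update(val1, val2):
    """Hypothetical helper: reuse the existing lines instead of re-plotting."""
    for line, val in ((line1, val1), (line2, val2)):
        if val is None:
            line.set_data([], [])  # no data for this channel right now
        else:
            line.set_data(np.arange(len(val)), val)
    ax.relim()            # recompute data limits from the current line data
    ax.autoscale_view()   # apply the new limits to the view
    fig.canvas.draw()


for _ in range(5):
    update(np.random.rand(300), None)
```

Note that the legend here always lists both entries, even when one line is empty, which may or may not be acceptable for a given application.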

In general, I don’t know enough about wx or python garbage collection to understand if the memory growing is really a problem. If it is released when you need it, why are you worried about it? So long as you don’t start swapping to disk, it seems the memory management should be fine.

Finally, Matplotlib is not in general thread safe: How-to — Matplotlib 3.3.3 documentation

mbrennwa · January 24, 2021, 2:00am

I can’t just do the legend once at init, because the number of lines in the plot may change along the way. That’s why the legend needs to be updated / re-done every time as in the code example I posted. My real application is quite a bit more involved in this regard.
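One possible middle ground, sketched under assumptions: keep a fixed pool of lines, toggle their visibility, and rebuild the legend only when the set of visible labels actually changes. This does not address any underlying leak; it would merely reduce how often a legend is created. The names lines, shown, legend_builds and update() are hypothetical, and the Agg backend stands in for wx.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without wx
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.grid()

# Fixed pool of lines, keyed by label (hypothetical structure).
lines = {
    "val-1": ax.plot([], [], color="r", label="val-1")[0],
    "val-2": ax.plot([], [], color="b", label="val-2")[0],
}
shown = None        # labels contained in the current legend
legend_builds = 0   # counts how often the legend was rebuilt


def update(data):
    """Update line data; rebuild the legend only if the visible set changed."""
    global shown, legend_builds
    for label, line in lines.items():
        val = data.get(label)
        line.set_visible(val is not None)
        if val is not None:
            line.set_data(np.arange(len(val)), val)

    visible = tuple(lbl for lbl, ln in lines.items() if ln.get_visible())
    if visible != shown:
        old = ax.get_legend()
        if old is not None:
            old.remove()
        if visible:
            ax.legend(handles=[lines[lbl] for lbl in visible], loc=1)
        shown = visible
        legend_builds += 1

    ax.relim(visible_only=True)
    ax.autoscale_view()
    fig.canvas.draw()


update({"val-1": np.random.rand(300)})
update({"val-1": np.random.rand(300)})                               # same set: no rebuild
update({"val-1": np.random.rand(300), "val-2": np.random.rand(300)})  # set changed: rebuild
```

With data that flips between channels on almost every update, as in the posted example, this would save little; it helps mainly when the set of lines changes rarely.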

Sure, the memory gets released if the system REALLY needs it. However, hoarding gigabytes of unnecessary memory for no good reason is just wrong. There must be a better way that is friendlier with the system resources.

Please note that the matplotlib part of the code is NOT executed in a separate thread. The wx.CallAfter(…) call triggers the execution of the plot code in the main thread of the program.

I don’t see why wx would be related to the growing memory. Just run the code and see for yourself how the matplotlib legend makes all the difference.

jklymak · January 24, 2021, 4:51am

Maybe just try clearing the figure each draw? You aren’t saving much by keeping the axes around if you are changing every artist.
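A rough sketch of what clearing the figure on each draw could look like, assuming the Agg backend in place of wx. The redraw() helper is a hypothetical name. Whether this actually avoids the memory growth is untested here.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without wx
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()


def redraw(val1, val2):
    """Hypothetical helper: throw away all artists and rebuild the plot."""
    fig.clear()                       # drops the axes, lines and legend
    ax = fig.add_subplot(1, 1, 1)
    ax.grid()
    if val1 is not None:
        ax.plot(val1, label="val-1", color="r")
    if val2 is not None:
        ax.plot(val2, label="val-2", color="b")
    handles, _ = ax.get_legend_handles_labels()
    if handles:                       # only draw a legend if there is something to label
        ax.legend(loc=1)
    fig.canvas.draw()
    return ax


ax = redraw(np.random.rand(300), None)
ax = redraw(np.random.rand(300), np.random.rand(300))
```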

I don’t have wx so I cannot run your example but maybe someone else can.

The source code for legend is on GitHub. If you track down a bug that would be very helpful.

mbrennwa · January 24, 2021, 10:58am

I am not deeply familiar with the matplotlib internals, but a quick look at the source code on GitHub showed the following:

(1) The legend source code at matplotlib/legend.py at master · matplotlib/matplotlib · GitHub defines the Legend class by deriving it from the Artist class. The Legend class does not have its own definition of the remove() function, so it just inherits the one from the Artist class.

(2) Looking at the Artist class definition at matplotlib/artist.py at master · matplotlib/matplotlib · GitHub shows this:

def remove(self):
        ...
        # Note: there is no support for removing the artist's legend entry.
        ...
        # TODO: add legend support
        ...

As far as I can tell from these findings, the remove() method is not properly implemented / supported for the Legend class in matplotlib. It is therefore no surprise that “removing” an existing legend does not release the memory associated with the legend.

mbrennwa · January 24, 2021, 11:04am

How can I delete an existing legend so that the memory associated with it gets freed?

jklymak · January 24, 2021, 4:00pm

If there is something in general wrong with the remove method can it be replicated without the app?
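One way to try replicating this without wx is sketched below, assuming the Agg backend and the standard-library tracemalloc module. The legend_cycle() helper is a hypothetical name. tracemalloc only sees allocations made through Python's allocator, so growth inside C extensions or the GUI toolkit would not show up here; treat the numbers as a hint, not a verdict.

```python
import gc
import tracemalloc

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no wx involved
import matplotlib.pyplot as plt
import numpy as np


def legend_cycle(n_iter, use_legend):
    """Repeat the remove/plot/legend/draw cycle and sample traced memory."""
    fig, ax = plt.subplots()
    tracemalloc.start()
    samples = []
    for _ in range(n_iter):
        # Same steps as on_plot_data() in the posted example.
        while len(ax.lines) > 0:
            ax.lines[0].remove()
        ax.plot(np.random.rand(300), label="val-1", color="r")
        if use_legend:
            old = ax.get_legend()
            if old is not None:
                old.remove()
            ax.legend(loc=1)
        fig.canvas.draw()
        gc.collect()  # rule out garbage that is merely awaiting collection
        samples.append(tracemalloc.get_traced_memory()[0])
    tracemalloc.stop()
    plt.close(fig)
    return samples


if __name__ == "__main__":
    for flag in (False, True):
        s = legend_cycle(20, use_legend=flag)
        print("use_legend =", flag, "first:", s[0], "last:", s[-1], "bytes")
```

If the traced size keeps climbing with use_legend=True but stays flat with False, that would point at matplotlib itself and not at wx.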

anntzer.lee · January 24, 2021, 4:08pm

The comment regarding legends is a red herring IMO; legends can be properly removed (at least they’re intended to be so), the comment rather says that when an artist gets removed, the corresponding legend entry is not removed as well.
I believe the issue is a valid one and I reopened the original issue.
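The distinction drawn above can be illustrated with a small sketch, assuming the Agg backend. Removing a line leaves its entry in the already-built legend, while calling remove() on the legend itself detaches it from the axes. This only shows the detachment; it says nothing about whether the legend's memory is actually freed, which is the open question in this thread.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no wx involved
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([0, 1], [0, 1], label="val-1", color="r")
leg = ax.legend(loc=1)

# Removing the artist does not touch the existing legend's entries.
line.remove()
labels_after_line_removed = [t.get_text() for t in leg.get_texts()]

# Removing the legend itself detaches it from the axes.
leg.remove()
legend_after_removed = ax.get_legend()

print(labels_after_line_removed, legend_after_removed)
```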