I have been using Matplotlib and Python for about 10 years and consider myself quite skilled. I recently realized something that puzzled me and want to share my thoughts:
Almost all Matplotlib tutorials, guides and examples start with `import matplotlib.pyplot as plt`. That’s fine, and it is what I have been doing since day 0. I also very much like the object-oriented way of making plots, and typical plotting code can then look something like:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.0, 1.0)
y = x**2
fig, ax = plt.subplots(1, 1)
ax.plot(x, y)
fig.savefig("parabola.pdf")
This, according to my own observations, is more or less a “textbook example” of very simple Matplotlib usage.
I have a program that generates plots from simulation results. Usually just a few, but on rare occasions there can be hundreds. Recently I got the warning:
RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
fig, ax = plt.subplots(1, 1)
And I was puzzled… Why do I have 20 figures open??? What?
I have my plotting nicely organized in a separate function, and my `fig` and `ax` objects are not automatically refcounted and deleted by the interpreter’s garbage collection???
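A minimal sketch that seems to reproduce the behaviour, assuming the non-interactive Agg backend. The helper name `make_plot` and the count of 5 are made up for illustration. `plt.get_fignums()` lists the figures that pyplot still holds on to, even though the local `fig` and `ax` names went out of scope when the function returned.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def make_plot():
    """Hypothetical plotting helper: creates a figure and never closes it."""
    x = np.linspace(0.0, 1.0)
    fig, ax = plt.subplots(1, 1)
    ax.plot(x, x**2)
    # fig and ax go out of scope here, but pyplot keeps its own reference


plt.close("all")  # start from a clean slate
for _ in range(5):
    make_plot()

open_figures = plt.get_fignums()
print(len(open_figures))  # number of figures pyplot still retains
```

So the local names are gone, but pyplot's own registry keeps every figure alive until it is closed explicitly.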
I have been aware that there are the very-old-school and in my opinion extremely cumbersome MATLAB-like plotting functions like `plt.plot(...)` and `plt.xlabel(...)`, which require an explicit closing, like in MATLAB, but I have always been under the impression that the modern object-oriented interfaces through `fig` and `ax` were not affected by this.
A solution to the memory leak and warning is, apparently, to close the figure with `plt.close(fig)`. However, this counteracts some of the beauty of the OO interfaces.
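One way to keep the explicit close from getting lost is to wrap the plotting in `try`/`finally`. This is only a sketch under my own assumptions: the helper name `save_parabola` is hypothetical, the Agg backend is selected so it runs without a display, and it writes to an in-memory buffer instead of "parabola.pdf" so that it leaves no files behind.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def save_parabola(target):
    """Hypothetical helper: plot, save to a path or file-like object, then close."""
    x = np.linspace(0.0, 1.0)
    fig, ax = plt.subplots(1, 1)
    try:
        ax.plot(x, x**2)
        fig.savefig(target, format="pdf")
    finally:
        plt.close(fig)  # release pyplot's reference even if saving fails


plt.close("all")  # start from a clean slate
for _ in range(25):  # more than the 20 figures mentioned in the warning
    save_parabola(io.BytesIO())

remaining = plt.get_fignums()
print(remaining)  # figures pyplot still retains after the loop
```

Closing the figure is enough here. The axes belong to the figure, so there is no separate close call for `ax`.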
But the aim of the post is to understand: how could I have been so wrong for 10 years?
When I look at more or less every Matplotlib example I understand: they all begin with `import matplotlib.pyplot as plt` and almost none of them ends with `plt.close(fig)`. There are almost no examples in the Matplotlib gallery that end with closing the figure properly. In my opinion this is training users to create memory leaks!
In the end my preferred solution is currently to stop using `pyplot` and instead do:
import matplotlib.figure as figure
import numpy as np
x = np.linspace(0.0, 1.0)
y = x**2
fig = figure.Figure()
ax = fig.subplots(1, 1)
ax.plot(x, y)
fig.savefig("parabola.pdf")
Then `fig` and `ax` seem to be properly refcounted and deleted when they go out of scope, as any other Python object usually is (am I right??).
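A way to check this yourself, as a sketch rather than a definitive test. It assumes the Agg backend, and it imports `pyplot` only to inspect its registry. Note that a figure and its axes reference each other, so I would expect the cycle collector (`gc.collect()`) to be what frees them, not plain reference counting. The weak reference lets you watch the figure without keeping it alive.

```python
import gc
import weakref

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt  # imported only to inspect pyplot's registry
from matplotlib.figure import Figure
import numpy as np

plt.close("all")  # start from a clean slate

x = np.linspace(0.0, 1.0)
fig = Figure()
ax = fig.subplots(1, 1)
ax.plot(x, x**2)

# pyplot does not know about a Figure that was created directly
registered = plt.get_fignums()

# Watch the figure through a weak reference, then drop our own references
fig_ref = weakref.ref(fig)
del fig, ax
gc.collect()  # figure and axes reference each other, so run the cycle collector
collected = fig_ref() is None
print(registered, collected)
```

If `collected` comes out `False` in your setup, something else is still holding a reference to the figure.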
In my opinion the latter variant is far more elegant. For anyone who is skilled in OO programming and Python this is way more intuitive than having this external teardown method `plt.close(fig)` that must be called manually on the `fig` object (what about the `ax`?).
My summary:
- Shouldn’t the examples be more correct and close their figures when finished, to educate the users? Why are there almost no examples that close the figure properly?
- Why is `import matplotlib.figure as figure` not the default, modern way of using Matplotlib? Why are there not more examples using this alone, without the `pyplot` singleton interfaces?