Where should I store custom cycler objects?

brian.mcfee · March 26, 2024, 5:35pm

I have a package that provides a number of custom data annotation visualizers, which are more or less intended to act similarly to standard matplotlib artists (plot, scatter, etc). For example, a common use pattern looks something like:

>>> import matplotlib.pyplot as plt
>>> ax = plt.gca()
>>> mypackage.plot_annotation(reference_data, ax=ax)
>>> mypackage.plot_annotation(estimate_data, ax=ax)

which adds two visualizations to the same axes object. Ideally, these should automatically iterate through a property cycler (just like ax.plot() would) and not require the user to explicitly set style parameters.

In the past, I’ve done some dirty things like hijacking the axes’ line or patches_for_fill cyclers, but that’s both ugly and no longer viable. Clearly I need to be constructing a custom cycler object, but this raises the question: where should the property cycler object live?

I see two options here, and would like some input to help me figure out which is the least bad:

1. Maintain a global (session-level) hash mapping axes to my custom cycler object.
2. Hack the custom cycler object into the axes object as a new attribute, e.g. ax.my_fancy_cycler = ...

I have a slight preference toward option #2, if only because it’s one fewer thing to manage from my side. However, I also recognize that adding attributes into someone else’s objects could backfire, but I’m not sure how likely that is to happen. Any thoughts or recommendations? Is there a viable third option that I’m not seeing?

brian.mcfee · April 16, 2024, 3:14pm

I sank probably more time into this than I should have, and have not yet come to what I would consider a viable solution here. Notes so far:

Approach 1: self-managed registry

This on the surface seems like the least bad idea, as it forces us to play by the rules.

The first problem I hit with this is that maintaining a dictionary of axes objects could lead to some pretty severe memory bloat over time (e.g. generating many plots in an interactive session), unless we have some way of managing garbage collection when axes are destroyed. This is a pretty common usage for the code in question, so letting the registry run wild is not a viable option.

In principle, we could use a weakref.WeakKeyDictionary to manage this, and it should clean up automatically when the key (axes object) is removed - however, this fails because axes objects are not properly hashable. (I attempted several workarounds via proxy hashes, but nothing panned out here.)
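For reference, the registry pattern described above looks roughly like the following sketch. It uses a placeholder class (`StandInAxes`, hypothetical) rather than a real Axes, so it only illustrates how `weakref.WeakKeyDictionary` drops an entry once its key is gone; it says nothing about whether matplotlib's Axes behaves the same way. The helper `next_style` and the style list are also placeholders, with `itertools.cycle` standing in for a real cycler.

```python
import gc
import itertools
import weakref


class StandInAxes:
    """Placeholder for an axes object: hashable by identity, weak-referenceable."""


# Hypothetical style sequence standing in for a real cycler.
_STYLES = [{"color": "C0"}, {"color": "C1"}, {"color": "C2"}]

# Keys are held weakly, so an entry disappears once its key is collected.
_registry = weakref.WeakKeyDictionary()


def next_style(ax):
    """Return the next style dict for `ax`, registering a cycle on first use."""
    cycle = _registry.get(ax)
    if cycle is None:
        cycle = itertools.cycle(_STYLES)
        _registry[ax] = cycle
    return next(cycle)


ax = StandInAxes()
styles = [next_style(ax), next_style(ax)]
size_while_alive = len(_registry)

# Drop the only strong reference and force a collection pass.
del ax
gc.collect()
size_after_delete = len(_registry)
```

With a real figure, other objects hold strong references to the axes, so the entry would only disappear once those are released as well.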

The only alternative I can see to this would be to register a callback function that removes the key from the dict when the figure is destroyed, but that would preclude any other callback functions from being attached to this signal.

Approach 2: make a sandbox in the axes object

This “works”, as far as I can tell, but as stated above, it’s obviously bad form. I was specifically worried about things like marshalling and serialization here. While these don’t seem likely to be implemented any time soon (judging from the status of MEP25 and other discussions of serialization in the issues), I’d still prefer to avoid this solution if at all possible.

Can anyone out there opine on the above, confirm if my thinking is correct, or suggest alternatives I haven’t considered?

tacaswell · April 25, 2024, 6:22pm

however, this fails because axes objects are not properly hashable. (I attempted several workarounds via proxy hashes, but nothing panned out here.)

Can you say more about this? You should be able to use Axes objects as keys in a dictionary (and quick testing with 3.8.4 works).
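A quick check along these lines might look like the sketch below. It assumes matplotlib is installed and builds the figure with `matplotlib.figure.Figure` directly, rather than through pyplot, purely to avoid needing a GUI backend; the variable names are arbitrary, and behavior may differ on versions other than the one mentioned above.

```python
import weakref

from matplotlib.figure import Figure

# Build an Axes without pyplot so no GUI backend is required.
fig = Figure()
ax = fig.add_subplot()

# Use the Axes as a key in a regular dict.
plain = {ax: "plain dict entry"}

# Use the same Axes as a key in a WeakKeyDictionary.
weak = weakref.WeakKeyDictionary()
weak[ax] = "weak dict entry"

ax_hash_is_int = isinstance(hash(ax), int)
found_in_plain = ax in plain
found_in_weak = ax in weak
```

If any of these steps raises a TypeError in a given environment, the traceback would help narrow down where the hashing problem comes from.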