I am working on the figure tests for sunpy (and, if I make it work, astropy) and have run into a bit of a wall when trying to test against development matplotlib. My objective is to get reliable figure comparison tests using only tox (pip) to pin versions, without having to rely on docker, conda, etc. to get non-Python deps like freetype.
I am making use of the fact that the binary wheels shipped for mpl statically link against freetype, which gives me a stable version of freetype. I am using the version reported by matplotlib.ft2font.__freetype_version__ to check that it’s the version we expect for the reference images. So far this approach seems to be working: the tests seem stable against the reference images.
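The check looks roughly like the following. This is a minimal sketch, not the actual sunpy code: the helper names and EXPECTED_FREETYPE are hypothetical, and the version string is only a placeholder for whatever the reference images were generated with.

```python
import importlib

# Assumption: placeholder for the FreeType version used to make the reference images.
EXPECTED_FREETYPE = "2.6.1"


def freetype_matches(reported, expected=EXPECTED_FREETYPE):
    """Return True when the reported FreeType version equals the expected one."""
    return reported.strip() == expected


def reported_freetype_version():
    """Read the FreeType version the installed matplotlib was built against."""
    # Imported lazily so this module can be loaded without matplotlib installed.
    ft2font = importlib.import_module("matplotlib.ft2font")
    return ft2font.__freetype_version__
```

In a test session, freetype_matches(reported_freetype_version()) can then decide whether the figure comparisons run or are skipped.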
I also tried basically the same approach for the git version of mpl. The default behaviour seems to be that, when installing from a checkout, freetype is downloaded and built into the extension; mpl then reports the same freetype version for all subsequent builds. However, there seem to be frequent shifts in the images, which I would not expect from changes in mpl alone. (See this for an example: https://55079-2165383-gh.circle-artifacts.com/0/.tmp/py37-figure-devdeps/figure_test_images/fig_comparison.html )
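One way to narrow down what is moving between runs would be to record the installed versions alongside the test output and diff them. The sketch below is only an illustration under a few assumptions: the function name and the package list are hypothetical choices, and it needs Python 3.8 or newer for importlib.metadata. For a git install, the matplotlib version string would typically also identify the commit, but I have not relied on that here.

```python
import json
import platform
from importlib import metadata


def environment_fingerprint(packages=("matplotlib", "numpy", "pillow")):
    """Collect installed versions of packages that might affect rendering."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # not installed in this environment
    return info


if __name__ == "__main__":
    # Emit JSON with sorted keys so two CI runs can be diffed directly.
    print(json.dumps(environment_fingerprint(), indent=2, sort_keys=True))
```

Saving this output as a CI artifact next to the comparison page would show whether a shift in the images coincides with a change in any pinned or unpinned dependency.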
Any advice on how dev mpl gets built, or on other things I should be checking aren’t shifting in my test setup, would be really helpful.