Suggestion: move the test baseline images out of git

I suppose this is the current place to discuss development ideas?


We currently have our baseline result images for tests in git, which is suboptimal. Changing the images leads to an accumulation of binary files in git history, which:

  • will gradually make git slower to use
  • makes code reviewers ask for intermediate images to be squashed away, and perhaps hesitate to accept changes that require changes to images
  • locks us to one version of FreeType and of any other dependencies that affect text rendering (such as HarfBuzz, suggested in #14912)

The current system has some benefits:

  • new images (at least png images) are seen together with the pull request so the reviewer can decide if they are correct
  • checking out an older revision automatically gives you the corresponding result images

Suggested solution

Here’s a proposal for an alternative system. Set up a place for developers and CI systems to store, download and view image files. This could be an S3 bucket, a fairly simple web app, or even a separate git repository. Give the images names similar to those in the current git repository, except allow multiple versions, so perhaps the names could be something like


Instead of keeping the images in git, store just the identifiers of the allowed versions of each image alongside the corresponding test case:

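from matplotlib.testing.decorators import image_comparison

# `correct` is the proposed new argument; it does not exist in Matplotlib today.
@image_comparison(['nan_path'], style='default', remove_text=True,
                  extensions=['pdf', 'svg', 'eps', 'png'],
                  correct=['1.pdf', '2.pdf', '1.svg', '2.png', '1.eps'])
def test_nan_isolated_points():
    ...  # test body unchanged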

This would imply that 1.png would no longer be considered a correct result, but both 1.pdf and 2.pdf are acceptable. The older 1.png would still be available in case an older branch refers to it.
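To make the reading of that list concrete, here is a minimal sketch of how the proposed `correct` argument could be interpreted. The helper name `accepted_versions` is hypothetical and only for illustration; it is not part of Matplotlib.

```python
from collections import defaultdict


def accepted_versions(correct):
    """Group entries such as '2.png' by extension (hypothetical helper)."""
    by_ext = defaultdict(list)
    for entry in correct:
        # '2.png' -> version '2', extension 'png'
        version, _, ext = entry.rpartition('.')
        by_ext[ext].append(version)
    return dict(by_ext)


correct = ['1.pdf', '2.pdf', '1.svg', '2.png', '1.eps']
accepted = accepted_versions(correct)
# For png only version 2 is accepted; for pdf both versions 1 and 2 are.
```

Under this reading, a png result would be compared only against version 2, while a pdf result passes if it matches either version.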

Changing the test images in a PR would happen by uploading a new image (say 3.png), which would be a more lightweight process than code review, and the PR would just include the identifier of the new image. We could build a GitHub bot that detects changed image references and adds a preview in a comment.
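The detection step for such a bot could be as simple as comparing the reference lists before and after the PR. This is only a sketch: the function name `newly_referenced` is made up, and how the bot would extract the lists from the diff is left open.

```python
def newly_referenced(old_correct, new_correct):
    """Return image references that the PR adds (hypothetical helper)."""
    old = set(old_correct)
    # Keep the order in which the new references appear in the test.
    return [name for name in new_correct if name not in old]


before = ['1.pdf', '2.pdf', '1.svg', '2.png', '1.eps']
after = ['1.pdf', '2.pdf', '1.svg', '2.png', '3.png', '1.eps']
to_preview = newly_referenced(before, after)
```

The bot would then post previews only for the entries in `to_preview`, here the newly uploaded 3.png.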

The tests could work by downloading the referenced images and comparing against them with the existing code (just allowing for different versions of the same image). Released versions of Matplotlib could be accompanied by a separate image archive so that offline testing is possible.
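A rough sketch of that flow, under several assumptions: the names `fetch_cached` and `matches_any`, the `download` callable, and the cache layout are all hypothetical, and byte equality stands in for the tolerance-based comparison the existing test code performs.

```python
from pathlib import Path


def fetch_cached(name, cache_dir, download):
    """Return the image bytes, downloading only if not already cached.

    `download` is any callable mapping a name to bytes (an assumption;
    it could wrap an S3 client, an HTTP request, or a git checkout).
    """
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(download(name))
    return path.read_bytes()


def matches_any(result, names, cache_dir, download,
                same=lambda a, b: a == b):
    """True if `result` matches at least one accepted baseline.

    `same` defaults to byte equality purely for illustration; a real
    test would plug in the existing image comparison instead.
    """
    return any(same(result, fetch_cached(n, cache_dir, download))
               for n in names)
```

With this layout, unpacking a release's image archive into `cache_dir` would let the tests run without network access, since `download` is only called for files that are missing.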

What do you think?

Jounni, there is a Google Summer of Code student working on this now with @anntzer, and there are a couple of issues and draft PRs. If you get a chance to check those out, could you make suggestions there? I have not followed the details closely enough to know how close their plans are to your suggestions above.

I guess this might be the central issue:

Yes, that's right. Sorry for not giving that link - I was on my phone :wink: