I suppose this is the current place to discuss development ideas?
Problem
We currently have our baseline result images for tests in git, which is suboptimal. Changing the images leads to an accumulation of binary files in git history, which:
- will little by little make git slower to use
- makes code reviewers ask for intermediate images to be squashed away, and perhaps hesitate to accept changes that require changes to images
- locks us to one version of FreeType and of any other dependency that affects text rendering (such as HarfBuzz, suggested in #14912)
The current system has some benefits:
- new images (at least png images) are seen together with the pull request so the reviewer can decide if they are correct
- checking out an older revision automatically gives you the corresponding result images
Suggested solution
Here’s a proposal for an alternative system. Set up a place for developers and CI systems to store, download and view image files. This could be an S3 bucket, a fairly simple web app, or even a separate git repository. Give the images names similar to those in the current git repository, but allow multiple versions of each, so the names could be something like
test_path/nan_path/1.pdf
test_path/nan_path/2.pdf
test_path/nan_path/1.svg
test_path/nan_path/1.png
test_path/nan_path/2.png
test_path/nan_path/1.eps
Instead of keeping the images in git, list just the allowed versions of each image alongside the corresponding test case:
@image_comparison(['nan_path'], style='default', remove_text=True,
                  extensions=['pdf', 'svg', 'eps', 'png'],
                  correct=['1.pdf', '2.pdf', '1.svg', '2.png', '1.eps'])
def test_nan_isolated_points():
    ...
This would imply that 1.png would no longer be considered a correct result, but both 1.pdf and 2.pdf are acceptable. The older 1.png would still be available in case an older branch refers to it.
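To make the naming scheme concrete, here is a minimal sketch of how the proposed correct list could be turned into the names used in the image store. The helper accepted_baselines and its arguments are hypothetical and exist only for illustration; nothing like it is in Matplotlib today.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def accepted_baselines(module, image, correct):
    """Map each file extension to the store names of its accepted versions.

    Hypothetical helper: `module` is e.g. 'test_path', `image` is e.g.
    'nan_path', and `correct` is the list from the proposed decorator argument.
    """
    by_ext = defaultdict(list)
    for name in correct:
        ext = PurePosixPath(name).suffix.lstrip(".")
        by_ext[ext].append(f"{module}/{image}/{name}")
    return dict(by_ext)


keys = accepted_baselines(
    "test_path", "nan_path", ["1.pdf", "2.pdf", "1.svg", "2.png", "1.eps"]
)
# keys["pdf"] lists both PDF versions; keys["png"] lists only 2.png,
# so 1.png stays in the store but is never compared against.
```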
Changing the test images in a PR would happen by uploading a new image (say 3.png), which would be a more lightweight process than code review, and the PR would just include the identifier of the new image. We could build a GitHub bot that detects changed image references and adds a preview in a comment.
The tests could work by downloading the referenced images and comparing against them with the existing code (just allowing for different versions of the same image). Released versions of Matplotlib could be accompanied by a separate image archive so that offline testing is possible.
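A rough sketch of that download-and-compare step might look like the following. Everything here is an assumption made for illustration: the base URL is a placeholder, fetch_baseline and matches_any are hypothetical names, and compare stands in for whatever wrapper around the existing comparison code returns True on a match.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumption: placeholder address, not a real service.
BASE_URL = "https://baseline-images.example.invalid"


def fetch_baseline(key, cache_dir, base_url=BASE_URL):
    """Return a local path for `key`, downloading only if it is not cached.

    Unpacking a release's image archive into `cache_dir` beforehand means
    no download is attempted, which is what makes offline testing possible.
    """
    target = Path(cache_dir) / key
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(f"{base_url}/{key}") as resp, \
                open(target, "wb") as out:
            shutil.copyfileobj(resp, out)
    return target


def matches_any(actual, keys, cache_dir, compare):
    """Return True if `actual` matches at least one accepted baseline.

    `compare(expected_path, actual_path)` is assumed to return True on a
    match; it would wrap the existing image comparison code.
    """
    return any(
        compare(fetch_baseline(key, cache_dir), Path(actual)) for key in keys
    )
```

A real implementation would probably also want to download to a temporary file and rename it, so that an interrupted download does not leave a truncated image in the cache.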
What do you think?