On testing with other FreeType versions

Hi all,

Downstream in Fedora (and maybe Debian), they are running into issues with
testing and text. Fedora 26 has FreeType 2.7.1, and Fedora 27 & Rawhide have
FreeType 2.8. Fedora 25 uses 2.6.5, but it will be EOL in the next week.
Many other distros are also transitioning to these newer FreeType versions
[1], and I think Anaconda recently added 2.8 too.

With 2.7.1, a few tests fail (rms < 1) and it is straightforward to patch
that [2]. With 2.8 though, over 800 tests fail [3] ranging up to ~80 rms
[4]. This is a bit harder to paper over.
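
For reference, the "rms" figures above are root-mean-square pixel differences between a rendered image and its reference image. A minimal pure-Python sketch of the measure on flattened pixel values follows; the function name `rms_difference` is made up for illustration, and Matplotlib's own comparison works on whole image arrays rather than lists.

```python
import math


def rms_difference(expected, actual):
    """Root-mean-square difference of two equal-length pixel sequences."""
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixel values")
    if not expected:
        return 0.0
    squared = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(squared / len(expected))


# A uniform difference of 2 on 0-255 values gives an RMS of 2.0.
print(rms_difference([10, 20, 30, 40], [12, 22, 32, 42]))  # 2.0
```

On this scale, a tolerance just under 1 absorbs barely visible differences, while values near 80 mean large parts of the image differ.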

I see a few ways to mitigate the problem, with varying
advantages/disadvantages:

1. Bundle the older version in the Matplotlib package like we do with
tests. I don't really believe this to be a viable option for downstream,
but I'm just mentioning it to be thorough. There are already a few (minor)
security issues in the one we test against.
2. Inject older FreeType just to run tests on the package. Again I don't
like this idea. The point of running tests is to be sure that the version
in a distro works *in that distro*. Testing with something a user could
never install seems useless.
3. Re-create all our current and future test images with 2.8. While this is
the most future-proof option, adding over 800 images is going to bloat the
repo quite a bit.
4. Create some sort of side repo with test images for other FreeType
releases. This would reduce bloat in the main repo but be somewhat more
work. Thus I'd only suggest doing so for tags.

I dislike the first two options as they would be repetitive across distros
(unless they just stopped testing altogether), but the last two are not
without work for us.

Opinions? Alternative ideas?

[1] https://repology.org/metapackage/freetype/versions
[2] https://github.com/QuLogic/matplotlib/commit/cfdc835923407810bd087f60332cdc8cdcb23f05
[3] https://kojipkgs.fedoraproject.org//work/tasks/3137/23623137/build.log
[4] https://gist.github.com/QuLogic/477055a847a44cd444a0932432acffd1

--
Elliott
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/matplotlib-devel/attachments/20171211/aee39dbb/attachment.html>

Hi,

1. Sticking to testing with the old FreeType.

Injecting an older FreeType version is "relatively" easy to do (the
question is whether you want to do it...). Naively, one could just set
LD_PRELOAD to /path/to/libfreetype.so, but that also affects subprocesses
such as ImageMagick, which (IIRC) don't like that. Instead, the correct way
is to ensure that the Python process calls
`dlopen('/path/to/libfreetype.so', RTLD_GLOBAL)`, which forces symbol
resolution *in this process* to check the given path first but does not
affect subprocesses. (Alternatively, one could remove LD_PRELOAD from the
environment before calling the subprocess, but that seems messier.)
Fortunately, dlopen is "effectively" available in Python under the name
`ctypes.CDLL`.
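
A minimal sketch of the in-process approach described above. The helper names `preload_library` and `env_without_ld_preload` are hypothetical, the path is a placeholder, and whether the extension module actually picks up the preloaded symbols depends on how it was built and linked. The load also has to happen before anything imports the extension that uses FreeType.

```python
import ctypes
import os


def preload_library(path):
    """dlopen() a shared library into the current process with RTLD_GLOBAL.

    Child processes are unaffected because no environment variable is set.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # ctypes.CDLL wraps dlopen(); keep the returned handle referenced.
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)


def env_without_ld_preload(environ=None):
    """Return a copy of the environment with LD_PRELOAD removed.

    This is the "messier" alternative: pass the result as env= when
    spawning a subprocess that must not see the preloaded library.
    """
    env = dict(os.environ if environ is None else environ)
    env.pop("LD_PRELOAD", None)
    return env


# Hypothetical usage, before importing the module that uses FreeType:
# handle = preload_library("/path/to/libfreetype.so")
# subprocess.run([...], env=env_without_ld_preload())
```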

I have a proof of principle somewhere that patches the testing framework to
1) ensure that an old freetype is built (basically moving the
local_freetype implementation from setupext to the main lib), and 2) load
it as above.

Another relevant issue is the manylinux wheels, which must somehow embed a
libfreetype. Currently I believe this is done via static linking. This is
not so great if you also want to load freetype for other reasons; for
example mplcairo (which loads freetype via cairo) currently cannot work
with local_freetype builds due to symbol conflicts. I believe that
switching to the standard manylinux approach (which is to include the
shared object in a hidden folder and set RPATH appropriately) would work
better (and allow us to strip out the static linking code).

2. Switching to newer FreeTypes.

I don't think committing all test images to the main repo is really a
viable option: FreeType also makes new releases every once in a while,
and different Linux distributions ship different versions
(https://pkgs.org/download/freetype gives 2.8.1 (Arch, Debian Sid), 2.8
(Fedora 27, Ubuntu 17.10), 2.6.3 (Debian 9, OpenSUSE 42.3), 2.6.1 (Ubuntu
16.04 LTS), and that's only a few).

I do believe that adding tooling that generates the test images to a side
repo for each tag + FreeType version (say, using the FT versions of the
major distros at the time of the tag) may be reasonable.
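
One way such a side repo could be consumed is a per-tag, per-FreeType-version directory with a fallback. This is only a sketch: the `<tag>/freetype-<version>` layout and the function name `baseline_dir` are assumptions, not an existing Matplotlib convention.

```python
from pathlib import Path


def baseline_dir(root, mpl_tag, freetype_version, fallback_version):
    """Pick the reference-image directory for a tag and FreeType version.

    Falls back to the images for `fallback_version` when the side repo has
    no directory for the requested FreeType version.
    """
    tag_dir = Path(root) / mpl_tag
    candidate = tag_dir / f"freetype-{freetype_version}"
    if candidate.is_dir():
        return candidate
    return tag_dir / f"freetype-{fallback_version}"


# Hypothetical usage: the tag and version strings are placeholders.
# images = baseline_dir("/srv/test-images", "v2.1.1", "2.8", "2.6.1")
```

Keeping the version an explicit argument lets a distro point the test suite at whichever FreeType it actually ships.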

3. Side note.

If #9763 (or #5414) gets accepted (new FT wrappers), it will also require
regenerating the test images: ft2font currently generates "wiggly
baselines" in certain cases (see example in #5414), and try as I might
(i.e. not so much) I could not reproduce them in the new wrapper :)

Antony

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel at python.org
https://mail.python.org/mailman/listinfo/matplotlib-devel

There may also be an interesting machine learning problem here: using a
more intelligent criterion for determining whether an image has failed.

By changing the FreeType version we get a bunch of images that fail pixel
comparison but should not, and by slightly modifying tests or shuffling
the test <-> result image mapping we can generate as many cases that fail
and should fail as we need.
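
The "should fail" half of such a labelled data set could be produced roughly as follows. This sketch only builds (test, reference image, label) triples; the function name, the label string, and the image names are all made up for illustration.

```python
import random


def mismatched_pairs(mapping, seed=0):
    """Pair each test with another test's reference image.

    `mapping` maps test name -> reference image name. Rotating a shuffled
    list of names by one guarantees that no test keeps its own image, so
    every returned pair is a "should fail" case (assuming the reference
    images themselves are distinct).
    """
    names = sorted(mapping)
    if len(names) < 2:
        raise ValueError("need at least two tests to build mismatches")
    random.Random(seed).shuffle(names)
    rotated = names[1:] + names[:1]
    return [(name, mapping[other], "should_fail")
            for name, other in zip(names, rotated)]


pairs = mismatched_pairs(
    {"test_a": "a.png", "test_b": "b.png", "test_c": "c.png"})
```

The "fails but should pass" half would come from rendering the same tests under a different FreeType version, as described above.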

Tom


Debian is also interested in this discussion. We have a local patch to
increase the RMS tolerance of several tests, but I agree with Elliott: the
number of failing tests keeps growing, and just bumping the threshold is
not always the best way (we risk making the test suite pass just to not
feel bad, without actually spotting real issues).

I remember a similar conversation in the past. The idea was to provide
multiple sets of reference images, built with different FreeType
versions, and to release them as an additional tarball that downstream
distributions can download and bundle with the Python code release
(remember that Debian doesn't want to download stuff during the build,
which is where we run the test suite).

I also really like Thomas' idea of having AI/ML actually inspect the
images and decide whether they are alike "enough" for the test to pass,
instead of a pixel-by-pixel comparison, but it may be a long-term effort
(GSoC maybe?), and we should also keep an eye on how long the test suite
takes to run (mpl already takes long enough to build as is lol).


--
Sandro "morph" Tosi
My website: http://sandrotosi.me/
Me at Debian: http://wiki.debian.org/SandroTosi
G+: https://plus.google.com/u/0/+SandroTosi

I guess something like 4 makes sense to me. You guys have more experience than I do, but it seems testing *most* of the repo with a fixed FreeType would be fine. The point of most of the tests is to catch Matplotlib bugs, and the exact font being rendered doesn't matter too much. Then there could be a much smaller separate repo for the tests that depend on the font being rendered, and to test that downstream distributions work.

Cheers, Jody


--
Jody Klymak
http://web.uvic.ca/~jklymak/


One more thing I forgot to mention is ImageHash [5], which is used by Iris
for image tests using the following strategy [6]:

   - using a perceptual 'image hash' of the outputs
   <https://github.com/JohannesBuchner/imagehash> as the basis for checking
   test results.
   - storing the hashes of 'known accepted results' for each test in a
   database in the repo
   - storing associated reference images for each hash value in a separate
   public repository <https://github.com/SciTools/test-images-scitools>,
   allowing human-eye judgement of 'valid equivalent' results.
   - a new version of the 'iris/tests/idiff.py' assists in comparing
   proposed new 'correct' result images with the existing accepted ones.

While this does reduce load in the main repo itself, it does increase the
cognitive load for developers. Iris has a small core group of developers
and far fewer drive-by contributions than Matplotlib, so I'm not sure we
want to adopt the whole approach. (Note also that their repo is LGPL3, so
please don't copy anything from there.) Using ImageHash instead of RMS
might still be useful, though it may generalize things too much.
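
To illustrate what comparing perceptual hashes instead of RMS means, here is a deliberately simplified average-hash sketch in pure Python. It is not ImageHash's implementation or API; the function names are invented, and the input is assumed to be a small grayscale grid that has already been downscaled.

```python
def average_hash(pixels):
    """Hash a 2D grid of grayscale values: one bit per pixel, set when the
    pixel is brighter than the mean of the whole grid."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(value > mean for value in flat)


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes of equal length."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_accepted(candidate, accepted_hashes, max_distance=0):
    """True if the candidate is close enough to any known accepted hash."""
    return any(hamming_distance(candidate, known) <= max_distance
               for known in accepted_hashes)


reference = average_hash([[0, 255], [255, 0]])
slightly_off = average_hash([[10, 240], [250, 5]])   # small pixel changes
inverted = average_hash([[255, 0], [0, 255]])        # different picture

print(hamming_distance(reference, slightly_off))  # 0
print(hamming_distance(reference, inverted))      # 4
```

Small per-pixel changes leave the hash untouched, which is the appeal for FreeType differences and also the risk of generalizing too much.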

[5] https://pypi.python.org/pypi/ImageHash
[6] https://github.com/SciTools/iris/blob/master/docs/iris/src/developers_guide/graphics_tests.rst#graphics-testing-strategy

On 11 December 2017 at 00:44, Elliott Sales de Andrade < quantum.analyst at gmail.com> wrote:

--
Elliott

Another idea might be to render most text in a unique colour that could then be excluded from some image diffs. Sure, there may be the occasional image that uses that colour, but a few missed pixels won't matter too much.
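
A rough sketch of that idea, assuming images are given as flat lists of RGB tuples. The reserved colour `TEXT_COLOUR` and the function name `masked_rms` are hypothetical. Anti-aliased glyph edges blend with the background and would not match the reserved colour exactly, so a real version would probably need a tolerance or non-anti-aliased text.

```python
import math

# Hypothetical reserved colour that only text is drawn in.
TEXT_COLOUR = (255, 0, 255)


def masked_rms(expected, actual, text_colour=TEXT_COLOUR):
    """RMS difference over RGB pixels, skipping any pixel that has the
    reserved text colour in either image."""
    total = 0.0
    count = 0
    for exp_px, act_px in zip(expected, actual):
        if exp_px == text_colour or act_px == text_colour:
            continue  # text pixel: excluded from the comparison
        for exp_c, act_c in zip(exp_px, act_px):
            total += (exp_c - act_c) ** 2
            count += 1
    return math.sqrt(total / count) if count else 0.0


expected = [(0, 0, 0), (255, 0, 255), (10, 10, 10)]
actual = [(0, 0, 0), (0, 0, 0), (10, 10, 10)]  # glyph pixel moved away
print(masked_rms(expected, actual))  # 0.0
```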
