Insight into pytest-mpl font differences

Looking for some insights into possible reasons for font differences I am now seeing, and ideas as to what I might do about this.

The back story: I use compare_images() in my regression tests, comparing test-generated images to reference images that were generated a while ago (see tests here). These comparisons don’t work on every machine, but I’ve got them working well enough that they pass and catch problems where I do most of my work, and on Travis, which runs these tests every time I enter a Pull Request.

The current story: This week I upgraded my WSL/Ubuntu18 to WSL-2/Ubuntu20. Once I got everything working, I made a fresh clone of my mplfinance repository and ran pytest. Almost all of the tests failed, but the only differences are slight size and weight differences in the fonts.

Here is an example:

Reference Image:

Test Image:

Image Diff:

So I am asking to see if anyone has any insights as to how I can find, or verify, which fonts changed, if any, and/or how I can avoid this situation in the future. I would really prefer not to have to go back to WSL (from WSL2). If I have to go back to Ubuntu18 that would be ok, I suppose.

As an additional point, which may or may not be relevant: I did notice the only test images that passed were those in which the font size and weight were NOT specified in rcParams (as they are in most tests). To be extra clear however, although the tests specify a font size and weight in rcParams, these values have NOT changed. The only change was WSL->WSL2 and Ubuntu18->Ubuntu20. I’m just saying that it is possible (perhaps) that specifying the font via rcParams exposes a sensitivity to the difference (gets the font from a different place?).

Does anyone have any suggestions where I should be looking and/or what I should do next to investigate this situation? Thanks for reading and thinking about this. And thanks in advance for any insights you may have! --Daniel

We test with a set version of freetype. I assume you are as well? You can check your setup.cfg.

I agree freetype is a candidate for what changed. If the underlying version of Matplotlib changed you may be catching subtle changes in how we do the font matching.

How are you installing Matplotlib?

This is exactly the kind of insight I was hoping for. I am completely unfamiliar with freetype. Will go read the docs and look into it. Definitely a strong suspect. Thanks!

By the way, to what “setup.cfg” are you referring?

If I recall, on this latest upgrade (WSL2 and Ubuntu20), I got matplotlib initially by installing Anaconda, and afterwards did pip install --upgrade matplotlib to bring me up to v3.3.0 (which I’m pretty sure is where I was prior to this OS upgrade).

setup.cfg.template is in the matplotlib root directory - if you edit it, then pip install -e . will heed its instructions…

@jklymak @tacaswell
Jody, Tom- Finally got around to looking into this. I think my matplotlib is using its own local install of freetype (as it should) so I’m confused as to why I may be getting different results. matplotlib was installed with pip install --upgrade matplotlib. But I’m not sure how to verify if that is really being used:

dino@DINO:~$ python -c 'import matplotlib.ft2font as ft;print(ft.__freetype_version__)'
dino@DINO:~$ conda list freetype
# packages in environment at /home/dino/anaconda3:
# Name                    Version                   Build  Channel
freetype                  2.10.2               h5ab3b9f_0

Any ideas? Thanks. --Daniel
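
One way to check which FreeType build Matplotlib actually loaded is to query the ft2font module from inside Python. The sketch below assumes a Matplotlib version whose ft2font extension exposes both __freetype_version__ and __freetype_build_type__; the helper name report_freetype is made up for illustration. Note that conda list only reports the conda package, which is not necessarily the library a pip-installed Matplotlib was built against.

```python
import matplotlib
import matplotlib.ft2font as ft2font


def report_freetype():
    """Return (matplotlib version, freetype version, build type) as strings."""
    # The build type is expected to be 'local' when Matplotlib was built
    # against its own bundled FreeType and 'system' otherwise. Fall back to
    # 'unknown' in case the attribute is missing (an assumption for safety).
    build_type = getattr(ft2font, "__freetype_build_type__", "unknown")
    return matplotlib.__version__, ft2font.__freetype_version__, build_type


if __name__ == "__main__":
    mpl_version, ft_version, build_type = report_freetype()
    print(f"matplotlib {mpl_version}, freetype {ft_version} ({build_type} build)")
    # The install location shows whether the conda or the pip copy is imported.
    print("imported from:", matplotlib.__file__)
```

Running this in each environment and comparing the printed lines should show whether the working and failing machines really use the same Matplotlib and FreeType builds.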

So, after the above, I decided to take conda out of the equation. I deleted and created a fresh new WSL-2 install of Ubuntu20 and also Ubuntu18. For both of them I did:

sudo apt update
sudo apt upgrade
sudo apt install python3-pip
pip3 install matplotlib
pip3 install mplfinance
git clone
sudo apt install python3-pytest
cd mplfinance
python3 -m pytest

The above ran my mplfinance tests, and in both cases I got the same results where the fonts are now slightly different. If I compare the images visually (flipping back and forth on my screen) it appears that the new test image definitely has larger text.

I’m kind of at a loss here for what to try next. Please correct me if I’m wrong: installing matplotlib via pip install should result in matplotlib using only __freetype_version__ 2.6.1, right?

What to try next? Thanks.

P.S. Note, I also tried the above, i.e. fresh install of matplotlib and mplfinance, on an existing install on ubuntu18, and it worked fine. It’s the new install of Ubuntu that has the problem. I must be doing something different. Works fine on Travis too.

I just thought of something else. The tests use savefig to save the plots to png files, and then compare them. Is it possible something changed there?

Sorry for the slow reply.

It is not clear to me whether the 2 conda-less installs of Ubuntu18 and Ubuntu20 produce images that are the same as each other (but different from the saved images), or whether they are different from each other, with one (or neither) matching the saved images.

If they match each other and not the saved images, I suspect that the issue is that your baseline images were saved using a different (likely newer, from conda) version of freetype, and your options are to re-generate them or to push the allowed RMS way up.
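
To make the effect of the RMS tolerance concrete, here is a minimal sketch using compare_images from matplotlib.testing.compare. The helper names save_plot and compare_with_tolerances, the figure contents, and the 12 pt versus 14 pt label sizes are all made up for illustration; they are not taken from the mplfinance tests.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.testing.compare import compare_images


def save_plot(path, label_size):
    """Save a small figure whose axis labels use `label_size` points."""
    fig, ax = plt.subplots(figsize=(4, 3), dpi=100)
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_xlabel("x label", fontsize=label_size)
    ax.set_ylabel("y label", fontsize=label_size)
    fig.savefig(path)
    plt.close(fig)


def compare_with_tolerances(tolerances=(0, 255)):
    """Compare a 12 pt plot with a 14 pt plot at each RMS tolerance."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        reference = os.path.join(tmp, "reference.png")
        test = os.path.join(tmp, "test.png")
        save_plot(reference, label_size=12)
        save_plot(test, label_size=14)
        for tol in tolerances:
            # compare_images returns None when the images match within
            # `tol`, otherwise a message describing the mismatch.
            results[tol] = compare_images(reference, test, tol=tol)
    return results


if __name__ == "__main__":
    for tol, outcome in compare_with_tolerances().items():
        print(tol, "match" if outcome is None else "MISMATCH")
```

A tolerance large enough to hide a font-size change will also hide real regressions of similar magnitude, which is the trade-off to weigh against regenerating the baselines.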

If you are going to regenerate them, consider testing via the svg or pdf formats. In those cases we check the vector format in and then rasterize it to compare at test time, which will avoid this problem.

We also have a GSOC student this summer working on adjusting the machinery so that we can generate the test images on the developer machine (or use pre-generated ones) to make us a bit more robust to these issues.

If they are different from each other, can you try deleting ~/.cache/matplotlib/fontlist* ? Are you using a non-default font for this figure? I could believe that there is some difference in the fonts installed by default in different ubuntu versions.
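
The cache directory is not always ~/.cache/matplotlib, so it may be safer to ask Matplotlib where it is. The sketch below only lists the fontlist cache files; the helper name find_font_cache_files is hypothetical, and the deletion is left commented out on purpose.

```python
import glob
import os

import matplotlib


def find_font_cache_files():
    """Return the fontlist cache files in Matplotlib's cache directory."""
    cache_dir = matplotlib.get_cachedir()
    return sorted(glob.glob(os.path.join(cache_dir, "fontlist*")))


if __name__ == "__main__":
    for path in find_font_cache_files():
        print(path)
        # Uncomment to delete the cache file; Matplotlib should rebuild its
        # font list the next time the font manager is loaded.
        # os.remove(path)
```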

Thanks. @tacaswell … Tom, can you confirm, if I install matplotlib with pip install --upgrade matplotlib, will it use the system installed freetype, or a matplotlib built freetype?
Thanks. --Daniel

Should be our home-built freetype in the wheels on pypi.

I’ll summarize what we know so far, and include some additional information:

  • I have 4 different environments; only one exhibits the problem.

  • In all cases, matplotlib was installed from PyPI (pip install matplotlib)

    • this suggests the issue is not with multiple versions of freetype. (See Tom’s comment)
  • The 4 environments are:

      1. Red Hat Enterprise Linux Server 6.8 Santiago (x86_64 on Xeon E7-4890 v2 @ 2.80GHz) WORKS FINE
      2. Ubuntu 18.04.5 LTS Bionic Beaver (x86_64 Xeon E5-1620 0 @ 3.60GHz) WORKS FINE
      3. Travis CI Ubuntu 16.04 Xenial (x86_64 on AMD64) WORKS FINE
      4. WSL Ubuntu 18.04.5 LTS (x86_64 on Xeon E5-1620 0 @ 3.60GHz) THIS IS THE ONE NOT WORKING

  • My original WSL Ubuntu 18 install WAS WORKING FINE.
    (This is the environment that was used to generate the Reference Images on 2020-06-28.
    These same reference images work fine on the first three environments, but fail on the WSL re-install.)

  • I am focusing on one particular set of regression tests just to keep the set small and manageable.
    Of the 10 tests in that set, 2 pass and 8 fail.
    The 8 fail because the Test Image Axes Label fonts are bigger now (in the WSL environment) than they are in the Reference images or in the other environments (as can be seen in the images above).

  • The 8 failed tests all use a style customization that appears to be handled differently in the new WSL environment than it is in all of the other environments.

  • I have narrowed down the potentially relevant style customizations to one or more of the following:
  plt.style.use('seaborn-darkgrid')
  plt.rcParams.update( {'axes.labelsize'   : 'large',
                        'axes.labelweight' : 'semibold',
                        'font.weight'      :  'medium',
                        'font.size'        : 12.0      } )
  • Although the 2 tests that succeed also have their own style customizations,
    none of those customizations involve axes.labelsize, axes.labelweight, font.weight, nor font.size.
    This lends credence to the idea that one (or more) of the above rcParams customizations is(are) handled differently in the re-installed WSL environment than in all of the other environments. Furthermore …
  • By adding the following customizations, I was able to get half of the 8 failed tests to pass, and the remaining 4, upon visual inspection, appear closer to the Reference Images. This further lends credence to the idea that the above rcParams customizations are handled differently in the re-installed WSL environment than in all of the other environments.
  plt.style.use('seaborn-darkgrid')
  plt.rcParams.update( {'axes.labelsize'   :  12.25,          # WAS 'large'
                        'axes.labelweight' : 'semibold',
                        'font.weight'      :  'medium',
                        'font.size'        :  12.0,      
                        'xtick.labelsize'  :  10.75,          # NEW
                        'ytick.labelsize'  :  10.75      } )  # NEW
  • What I’ve done so far:
    • Multiple WSL and WSL-2 installs, with Ubuntu 16, 18, 20. Results are all the same.
    • Clean partition re-install of Windows 10 (without Microsoft Insider build) followed by WSL Ubuntu 18 (no conda)
      This should match my original WSL install that was working; but this re-install still fails.
    • Downgrade system installed freetype. NO EFFECT
    • Delete ~/.cache/matplotlib on both WORKING and NOT WORKING environments. NO EFFECT
      (working keeps working, and not working keeps not working).
    • Try multiple versions of matplotlib (3.1, 3.2, 3.3) NO EFFECT
      (working keeps working, and not working keeps not working).
    • I have not verified (as Tom suggested), beyond a cursory visual inspection, that the test images that fail with different distros and other tweaks to the WSL environment match each other. However, the fact that all tests continue to work in all other environments (with multiple versions of matplotlib), and the fact that matplotlib was always installed from PyPI, leads me to discount the idea that I am accessing different versions of freetype. More likely, I think, is that freetype, or something else, is accessing something in the environment that has changed, and that something is used somehow within the implementation of the rcParams customizations that I have in the 8 failed tests.
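
A quick way to see what point sizes these settings resolve to in a given environment is to ask FontProperties directly. This is only a sketch: resolve_sizes, original, and adjusted are names made up here, and it assumes that named sizes such as 'large' are scaled relative to font.size (with font.size 12.0, 'large' should come out near 14.4 pt). Running it on a working and a failing machine would show whether the rcParams themselves resolve differently, or whether the difference arises later, at font lookup or rendering.

```python
import matplotlib
from matplotlib.font_manager import FontProperties

SIZE_KEYS = ("axes.labelsize", "xtick.labelsize", "ytick.labelsize")


def resolve_sizes(rc_overrides, keys=SIZE_KEYS):
    """Return {rcParam name: size in points} with `rc_overrides` applied."""
    with matplotlib.rc_context(rc_overrides):
        return {
            # FontProperties converts names like 'large' into points.
            key: FontProperties(size=matplotlib.rcParams[key]).get_size_in_points()
            for key in keys
        }


# The two sets of size-related customizations discussed above.
original = {"axes.labelsize": "large", "font.size": 12.0}
adjusted = {"axes.labelsize": 12.25, "font.size": 12.0,
            "xtick.labelsize": 10.75, "ytick.labelsize": 10.75}

if __name__ == "__main__":
    print("original:", resolve_sizes(original))
    print("adjusted:", resolve_sizes(adjusted))
```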

Thanks once again to everyone reading this. If you have any insights into the inner workings of matplotlib, and what I might suspect or look at to see what may be different in the one environment that is failing, I will greatly appreciate any guidance or suggestions you may give.

This is a rather wild issue! I’m also pretty convinced that it is not freetype (which is sad because we know how to solve that problem).

My next guess is that we are not finding the right fonts for semibold / something is wrong with the fonts. Could you check the output of matplotlib.font_manager.findfont on the various systems?

It looks like seaborn-dark sets the font to Arial (well, a long list of fallbacks that starts with Arial). If Arial (or one of its weight variants) is missing, that may be the source of the problem.
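
A minimal sketch of that check, to be run on each system and compared by eye. The helper names resolve_fonts and installed_variants are hypothetical, and the list of weights is just a guess at what is worth inspecting. To mirror the tests, the same style and rcParams would need to be applied first, since findfont resolves the generic sans-serif family through the current rcParams.

```python
from matplotlib import font_manager
from matplotlib.font_manager import FontProperties

WEIGHTS = ("normal", "medium", "semibold", "bold")


def resolve_fonts(family="sans-serif", weights=WEIGHTS):
    """Return {weight: path of the font file findfont picks for it}."""
    return {
        weight: font_manager.findfont(FontProperties(family=family, weight=weight))
        for weight in weights
    }


def installed_variants(name):
    """Return sorted (weight, style, path) tuples for fonts called `name`."""
    return sorted(
        # Weights may be ints or strings, so normalize them for sorting.
        (str(entry.weight), entry.style, entry.fname)
        for entry in font_manager.fontManager.ttflist
        if entry.name == name
    )


if __name__ == "__main__":
    for weight, path in resolve_fonts().items():
        print(f"{weight:>9}: {path}")
    # An empty result means Matplotlib does not know any font by this name.
    for variant in installed_variants("Arial"):
        print(variant)
```

If the paths differ between the working and failing environments, or if one environment lacks a weight variant that another has, that would point at font selection instead of rendering.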

Thanks for pointing a direction to go next. Yes, this is a wild one. Haven’t had one this challenging in a while. Having a lot of fun trying to track it down. Very much appreciate your guidance!!