Hi,
I am writing about a fairly commonly encountered issue involving matplotlib and its font cache:
/cle60up07/python/2.7.14/matplotlib/2.1.0/lib/python2.7/site-packages/matplotlib-2.1.0-py2.7-linux-x86_64.egg/matplotlib/font_manager.py:279: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment. 'Matplotlib is building the font cache using fc-list. '
However, I haven’t managed to find an answer specific to my question, which is – “When does matplotlib require building the font cache?” Some notes on how and where I am running my plot routine:
- I am running matplotlib/2.1.0
- I am running multiple instances of the routine that invokes matplotlib.pyplot on multiple nodes of a cluster.
- MPLCONFIGDIR is not set explicitly.
- I see a fontList.cache file in my ${HOME}/.matplotlib area.
- Of the 36 instances of the plot job running on 36 independent nodes, only 4 nodes showed the above warning message.
The trouble was that these 4 jobs went on for 18 hours, after which they had to be killed forcefully to release the nodes back for use. These 4 jobs were by no means the first ones to have run on the cluster. In fact, the earlier jobs ran fine without having to build the font cache.
q1. One hypothesis is that multiple matplotlib.pyplot instances (running simultaneously) are trying to create/delete the font cache, leading to conflicts. Would you be surprised if this were the case?
q2. A proposed workaround is to define the environment variable MPLCONFIGDIR and point it to a /tmp directory local to each node (basically to the shared memory). Would you approve of this workaround?
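For concreteness, here is a minimal sketch of how I understand the workaround in q2. The function name and the "mplconfig-" prefix are made up for illustration, and I am assuming the variable has to be set before matplotlib is first imported for it to take effect:

```python
import os
import tempfile


def use_node_local_mplconfigdir(base=None):
    """Create a private directory under `base` and point MPLCONFIGDIR at it.

    `base` defaults to the system temp directory (typically /tmp).
    """
    if base is None:
        base = tempfile.gettempdir()
    # mkdtemp creates a uniquely named directory, so concurrent jobs
    # never share (or fight over) the same config/cache location.
    config_dir = tempfile.mkdtemp(prefix="mplconfig-", dir=base)
    os.environ["MPLCONFIGDIR"] = config_dir
    return config_dir


config_dir = use_node_local_mplconfigdir()
print("MPLCONFIGDIR set to", config_dir)

# Assumption: matplotlib must be imported only after the variable is set.
# import matplotlib
# matplotlib.use("Agg")
# import matplotlib.pyplot as plt
```

As far as I can tell, the trade-off is that every job then starts with an empty directory and so builds its own font cache once, in exchange for never touching the shared ${HOME}/.matplotlib area. The directory is not cleaned up automatically.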
q3. Personally, I suspect that the ${HOME}/.matplotlib area could have been momentarily inaccessible from those nodes (a transient issue). Would this not lead to the font cache being rebuilt?
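To test the hypothesis in q3 if this ever happens again, each job could log the state of the cache file at startup. This is only a sketch using the standard library; the function name is made up, and the default path assumes the cache sits at ${HOME}/.matplotlib/fontList.cache as it does for me:

```python
import os
import socket
import time


def describe_font_cache(path=None):
    """Return a small dict describing the font cache file as seen from this node."""
    if path is None:
        # Assumed location, taken from where I see the file on my system.
        path = os.path.join(os.path.expanduser("~"), ".matplotlib", "fontList.cache")
    info = {
        "host": socket.gethostname(),
        "MPLCONFIGDIR": os.environ.get("MPLCONFIGDIR"),
        "path": path,
        "exists": os.path.isfile(path),
        "readable": os.access(path, os.R_OK),
        "mtime": None,
    }
    if info["exists"]:
        # Last modification time, to spot a cache rewritten by another job.
        info["mtime"] = time.ctime(os.path.getmtime(path))
    return info


print(describe_font_cache())
```

If a node reports the file as missing or unreadable while other nodes can see it, that would point to a transient filesystem problem rather than to matplotlib itself.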
So my real question:
q4. In your opinion, what could have forced Matplotlib to rebuild the font cache, given that the jobs preceding the ones showing the warning ran fine?
PS: I am unable to reproduce this issue. I have run this code hundreds of times on various data sets over the last few years and never faced this problem. So I am a little perplexed as to why I saw it a few weeks ago on 4 nodes and have not been able to reproduce it since.
Thanks,
wasim