[Matplotlib-users] Running matplotlib on massively parallel compute resources

We've recently seen an issue with someone running multiple instances
of jobs on our supercomputer, all of which have a matplotlib component
that therefore runs on the compute nodes, rather than as part of any
post-processing on our ancillary services.

Some of these jobs ended up hanging and, in a number of cases, we have
observed that the hanging process is what we believe to be the
matplotlib-spawned

   fc-list --format=%{file}\n

Is there anything in the way that matplotlib is written that might
lead to race conditions around access to the per-user font cache, or
other matplotlib data, while it is being created?

Furthermore, is there a way that our users could define a per-job font
cache directory, by using the job-ID, and thereby explicitly avoid
any inter-job interference resulting from their "massively parallel"
matplotlib invocations?

Here's hoping that matplotlib is the cause, and, if so, that there's an
easy solution, when you know how to use matplotlib.

···

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@python.org
https://mail.python.org/mailman/listinfo/matplotlib-users

I think setting the MPLCONFIGDIR environment variable to a user-writable directory should be good enough. There’s an open PR (https://github.com/matplotlib/matplotlib/pull/15933) which adds a warning in the case where that’s needed.
Antony
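
A minimal sketch of how the suggestion above could be applied per job,
for readers of the archive. It assumes a SLURM-style SLURM_JOB_ID
environment variable (the thread does not name the scheduler) and uses a
hypothetical helper, per_job_mplconfigdir, that is not part of matplotlib.
When MPLCONFIGDIR is set, matplotlib uses that directory for both its
configuration and its cache, including its font cache.

```python
import os
import tempfile


def per_job_mplconfigdir(base=None):
    """Create and return a per-job directory for matplotlib's config and cache.

    Hypothetical helper for illustration; not a matplotlib function.
    """
    # Assumption: the scheduler exports SLURM_JOB_ID. Fall back to the
    # process ID so the sketch also works outside a batch job.
    job_id = os.environ.get("SLURM_JOB_ID", f"pid{os.getpid()}")
    # Assumption: the system temp directory is writable on the compute node.
    base = base or tempfile.gettempdir()
    path = os.path.join(base, f"mplconfig-{job_id}")
    os.makedirs(path, exist_ok=True)
    return path


# Set the variable before matplotlib is first imported, so that the import
# picks up the per-job directory rather than the shared per-user one.
os.environ["MPLCONFIGDIR"] = per_job_mplconfigdir()
```

Any "import matplotlib" should come after the last line. The same effect
can be had by exporting MPLCONFIGDIR in the job script itself; node-local
storage may be a better choice for the base directory than a shared
filesystem, depending on the site.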

···

On Thu, Jan 16, 2020 at 4:41 AM Kevin Buckley kevin.buckley.pawsey.org.au@gmail.com wrote:

···

Cheers for that, Antony: seems to do what's needed.

I had seen references to MPLCONFIGDIR in a number of threads relating
to matplotlib but hadn't quite worked out if it was what we needed.

Kevin

···

On 2020/01/17 17:39, Antony Lee wrote:

I think setting the MPLCONFIGDIR environment variable to a user-writable
directory should be good enough. There's an open PR
(https://github.com/matplotlib/matplotlib/pull/15933, "Warn if a temporary config/cache dir must be created")
which adds a warning in the case where that's needed.
Antony
