Corrupted images while saving

Hi,

For my thesis I, like many others, use the SciPy/Matplotlib ecosystem to process my data.

I started analyzing my data in the past few weeks and wrote my Python script so that it ingests a JSON file with different measurements. Per measurement I do some calculations and plot my data. This is done in parallel using mp.Pool.

Lately I have been getting this kind of image:

The data is plotted, but somewhere something goes wrong.
I tried setting the dpi to be the same as the figure’s, but to no avail.

The strange thing is that which image in the series gets corrupted is random, so my suspicion is that Matplotlib’s savefig does not play nicely with the thread pool.

Can someone confirm this? And what is a proper way to do what I’m trying to do (return a figure handle from the threads and save the figures serially)?

Thanks for reading

I don’t know if this is your problem, but Matplotlib is not thread safe, so I’d do your calculations threaded if you like, save the results to disk, and then plot the reduced data with a serial job (perhaps many simultaneous serial jobs). It’s almost always better not to mix plotting and analysis anyway, particularly for expensive calculations, because when you need to revise your figure for Reviewer 2, you will be a lot happier if you just have to rerun a plotting script on a reduced data set.
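To make that concrete, here is a rough sketch of the split, not a drop-in replacement for your script. I’m assuming each measurement is a dict with a "name" and a list of "values"; those keys, the function names, and the mean calculation are all made up for illustration. The calculations run in a thread pool and never touch Matplotlib, the reduced results are written to disk as JSON, and the plotting happens afterwards in a plain loop.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def reduce_measurement(measurement):
    """Hypothetical per-measurement calculation; touches no Matplotlib state."""
    values = measurement["values"]
    return {
        "name": measurement["name"],
        "values": values,
        "mean": sum(values) / len(values),
    }


def reduce_all(measurements, out_dir, workers=4):
    """Run the calculations in a thread pool, then write one JSON file each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs
        results = list(pool.map(reduce_measurement, measurements))
    paths = []
    for result in results:
        path = out_dir / f"{result['name']}.json"
        path.write_text(json.dumps(result))
        paths.append(path)
    return paths


def plot_all(paths):
    """Plot the reduced data one file at a time, in a single thread."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend that only writes files
    import matplotlib.pyplot as plt

    saved = []
    for path in paths:
        data = json.loads(Path(path).read_text())
        fig, ax = plt.subplots()
        ax.plot(data["values"])
        ax.axhline(data["mean"], linestyle="--")
        ax.set_title(data["name"])
        png = Path(path).with_suffix(".png")
        fig.savefig(png, dpi=150)
        plt.close(fig)  # release the figure before starting the next one
        saved.append(png)
    return saved
```

You would call something like plot_all(reduce_all(measurements, "reduced")) once, and later rerun only plot_all on the saved JSON files when a figure needs revising.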

Thanks for your reply.

Yeah, I read up a bit more on MPL and thread safety, and your answer confirms it. Such a shame that it’s not easily possible.

My Python code was already called from a Rust codebase in which I did the dataset selection processing. That was already parallelized, so now I’ve refactored that codebase to do the splitting into a single series. Now it works just fine. I was so accustomed to the compiler reminding me when I do something I am not allowed to do.

Thanks

Glad that fixed it. I’m not sure what’s involved in making something with I/O like Matplotlib thread safe, but I assume it’s not trivial.