# Plotting large file (NetCDF)

RE: [Matplotlib-users] Plotting large file (NetCDF)
Hi Ben and Ryan,

I will try to figure out how it works.

Thank you.

Regards,

Raf


-----Original Message-----

From: ben.v.root@…287… on behalf of Benjamin Root

Sent: Tue 9/9/2014 3:25 PM

To: Ryan Nelson

Cc: Raffaele Quarta; Matplotlib Users

Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)

Most of the time, you will not need to use meshgrid. Take advantage of broadcasting instead. It saves significantly on memory and processing time. Most of Matplotlib’s plotting functions work well with broadcastable inputs, so that is a great way to save on memory. NumPy’s ogrid is also a neat tool.

When I get a chance, I’ll look through the script for any other obvious savers.
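A minimal sketch of the broadcasting idea, using made-up 1-D latitude and longitude axes (the names `lats`, `lons` and the grid size are assumptions for illustration, not taken from the attached script). It shows that a column vector combined with a row vector gives the same 2-D result as the meshgrid route, and what `np.ogrid` returns:

```python
import numpy as np

# Hypothetical 1-D coordinate axes (stand-ins for the file's lat/lon variables).
lats = np.linspace(-90.0, 90.0, 181)
lons = np.linspace(-180.0, 180.0, 361)

# Dense route: meshgrid materialises two full (181, 361) coordinate arrays.
lon2d, lat2d = np.meshgrid(lons, lats)
dense = np.cos(np.radians(lat2d)) * np.sin(np.radians(lon2d))

# Broadcast route: a (181, 1) column times a (1, 361) row gives the same
# (181, 361) result without building the 2-D coordinate arrays first.
lat_col = np.cos(np.radians(lats))[:, np.newaxis]
lon_row = np.sin(np.radians(lons))[np.newaxis, :]
broadcast = lat_col * lon_row

# np.ogrid returns "open" grids: one array per axis, shaped to broadcast.
rows, cols = np.ogrid[0:181, 0:361]
index_sum = rows + cols  # broadcasts to shape (181, 361)
```

Whether a given plotting call accepts the un-expanded inputs depends on the function, so it is worth checking with a small array first.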

Cheers!

Ben Root

On Tue, Sep 9, 2014 at 9:02 AM, Ryan Nelson <rnelsonchem@…287…> wrote:

Raffaele,

As Ben pointed out, you might be creating a lot of in-memory NumPy arrays that you probably don’t need or want.

For example, I think (?) slicing the whole variable, as below:

    lons = fh.variables['lon'][:]

is making a copy of all that (mmap’ed) data as a NumPy array in memory. Get rid of the slice ([:]). Of course, these variables are then not NumPy arrays, so you’ll have to change some of your code. For example:

    lon_0 = lons.mean()

will have to become:

    lon_0 = np.mean(lons)

If lats and lons are very large sets of data, then meshgrid will make two very, very large arrays in memory.

For example, try this:

    np.meshgrid(np.arange(5), np.arange(5))

The output is two much larger arrays:

    [array([[0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4]]),
     array([[0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1],
            [2, 2, 2, 2, 2],
            [3, 3, 3, 3, 3],
            [4, 4, 4, 4, 4]])]

I don’t know Basemap at all, so I don’t know if this is necessary. You

might be able to force the meshgrid output into a memmap file, but I don’t

know how to do that right now. Perhaps someone else has some suggestions.
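To put a number on the meshgrid cost, here is a rough back-of-the-envelope sketch. The helper `meshgrid_nbytes` and the 10000 x 5000 grid size are hypothetical, chosen only for illustration; the arithmetic assumes float64 coordinates (8 bytes per value). It also shows the `sparse=True` option of `np.meshgrid`, which returns broadcastable arrays instead of dense ones:

```python
import numpy as np


def meshgrid_nbytes(n_lon, n_lat, itemsize=8):
    """Bytes held by the two dense arrays np.meshgrid returns for these axes."""
    return 2 * n_lon * n_lat * itemsize


# Check the formula against a real (small) float64 meshgrid.
xx, yy = np.meshgrid(np.arange(5, dtype=np.float64),
                     np.arange(5, dtype=np.float64))
small_actual = xx.nbytes + yy.nbytes
small_estimate = meshgrid_nbytes(5, 5)

# Hypothetical large grid: estimate the cost without allocating anything.
estimate_gib = meshgrid_nbytes(10000, 5000) / 2**30

# sparse=True keeps each axis as a (1, n) or (n, 1) array that broadcasts.
sx, sy = np.meshgrid(np.arange(5.0), np.arange(5.0), sparse=True)
```

The estimate covers only the coordinate arrays; any temporaries a plotting routine creates from them come on top of that.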

Hope that helps.

Ryan

On Tue, Sep 9, 2014 at 4:07 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:

Hi Jody and Ben,

I tried to use pcolormesh instead of pcolor and the result is very good! As for the memory problem, I wasn’t able to solve it. When I tried to use the bigger file, I got the same problem. Attached you will find the script that I’m using to make the plot. Maybe I didn’t understand very well how to use the mmap function.

Regards,

Raffaele.

-----Original Message-----

From: Jody Klymak [mailto:jklymak@…4192…]

Sent: Mon 9/8/2014 5:46 PM

To: Benjamin Root

Cc: Raffaele Quarta; Matplotlib Users

Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)

It looks like you are calling `pcolor`. Can I suggest you try `pcolormesh`?
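A small sketch of what that switch can look like in plain Matplotlib (not Basemap, which the original script uses). The grid size, the variable names and the synthetic field are assumptions for illustration. The coordinates are passed as 1-D cell edges, so no meshgrid call is needed, and the figure is rendered to PNG with the non-interactive Agg backend:

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for writing PNG files
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: a 200 x 400 field with 1-D cell-edge coordinates.
ny, nx = 200, 400
lon_edges = np.linspace(-180.0, 180.0, nx + 1)
lat_edges = np.linspace(-90.0, 90.0, ny + 1)
field = np.add.outer(np.arange(ny), np.arange(nx)).astype(float)

fig, ax = plt.subplots()
# pcolormesh accepts 1-D coordinates and draws the grid as a single QuadMesh.
mesh = ax.pcolormesh(lon_edges, lat_edges, field)
fig.colorbar(mesh, ax=ax)

# Render to an in-memory PNG; pass a filename instead to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```

With Basemap the call would go through the map object instead, and the projected coordinates may need to be 2-D, so treat this only as an outline of the Matplotlib side.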

75 MB is not a big file!

Cheers, Jody

On Sep 8, 2014, at 7:38 AM, Benjamin Root <ben.root@…3203…04…> wrote:

(Keeping this on the mailing list so that others can benefit)

What might be happening is that you are keeping more numpy arrays in memory than you actually need. Take advantage of memmapping, which most netcdf tools provide by default. This keeps the data on disk rather than in RAM. Second, for very large images, I would suggest either pcolormesh() or just simply imshow() instead of pcolor(), as they are far more efficient. In addition, it sounds like you are dealing with re-sampled data (“at different zoom levels”). Does this mean that you are re-running contour on re-sampled data? I am not sure what the benefit of doing that is if one could just simply do the contour once at the highest resolution.
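The memmapping idea can be illustrated with NumPy alone. This sketch uses `np.memmap` on a raw binary file as a stand-in; it is not the NetCDF API, and the file name, shape and window indices are made up. The point is only that the data stays on disk until a slice is actually requested:

```python
import os
import tempfile

import numpy as np

# Write a hypothetical raw float32 grid to a temporary file (stand-in data).
shape = (1000, 2000)
path = os.path.join(tempfile.mkdtemp(), "grid.dat")
writer = np.memmap(path, dtype=np.float32, mode="w+", shape=shape)
writer[:] = np.arange(shape[1], dtype=np.float32)  # every row is 0, 1, 2, ...
writer.flush()
del writer

# Re-open read-only: nothing is read into RAM until the array is indexed.
grid = np.memmap(path, dtype=np.float32, mode="r", shape=shape)

# Copy only the window that will be plotted into an ordinary in-memory array.
window = np.array(grid[100:200, 500:600])
```

NetCDF readers expose their own variable objects with similar lazy behaviour, but the details (and whether a full slice copies everything) differ between libraries.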

Without seeing any code, though, I can only provide generic suggestions.

Cheers!

Ben Root

On Mon, Sep 8, 2014 at 10:12 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:

Hi Ben,

Sorry for giving you so few details. I’m trying to make a contour plot of a variable at different zoom levels using high-resolution data. The aim is to obtain PNG output images. I’m working with big data (a NetCDF file of about 75 MB). The Matplotlib version on my Ubuntu 14.04 machine is 1.3.1. My system has 8 GB of RAM.

I run into memory problems when I try to make a plot. I get the following error message:

        cs = m.pcolor(xi,yi,np.squeeze(t))
      File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 521, in with_transform
        return plotfunc(self,x,y,data,*args,**kwargs)
      File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 3375, in pcolor
        x = ma.masked_values(np.where(x > 1.e20,1.e20,x), 1.e20)
      File "/usr/lib/python2.7/dist-packages/numpy/ma/core.py", line 2195,
        condition = umath.less_equal(mabs(xnew - value), atol + rtol * mabs(value))
    MemoryError

By contrast, when I plot a smaller file (about 5 MB), it works very well. I don’t believe there is anything wrong in the script; it might be a memory problem.

I hope that my message is clearer now.

Thanks for the help.

Regards,

Raffaele

Sent: Mon 9/8/2014 3:19 PM

To: Raffaele Quarta

Cc: Matplotlib Users

Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)

You will need to be more specific… much more specific. What kind of plot are you making? How big is your data? What version of matplotlib are you using? How much RAM do you have available compared to the amount of data? (Most slowdowns are actually due to swap-thrashing issues.) Matplotlib can be used for large data, but there exist some specialty tools for truly large datasets. The solution depends on the situation.

Ben Root

On Mon, Sep 8, 2014 at 7:45 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:

Hi,

I’m working with the NetCDF format. When I try to plot a very large file, I have to wait a long time for the plot. How can I solve this? Is there a solution for this problem?

Raffaele

This email was Virus checked by Astaro Security Gateway.

http://www.sophos.com

Want excitement?

When you want reliability, choose Perforce

Perforce version control. Predictably reliable.

Matplotlib-users mailing list

Matplotlib-users@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/matplotlib-users


Jody Klymak

http://web.uvic.ca/~jklymak/
