RE: [Matplotlib-users] Plotting large file (NetCDF)
Hi Ben and Ryan,
I will try to figure out how it works.
Thank you.
Regards,
Raf
-----Original Message-----
From: ben.v.root@…287… on behalf of Benjamin Root
Sent: Tue 9/9/2014 3:25 PM
To: Ryan Nelson
Cc: Raffaele Quarta; Matplotlib Users
Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)
Most of the time, you will not need to use meshgrid. Take advantage of
numpy’s broadcasting feature:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
It saves significantly on memory and processing time. Most of
Matplotlib’s plotting functions work well with broadcastable inputs, so
that is a great way to save on memory. NumPy’s ogrid is also a neat tool
for generating broadcastable grids.
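A minimal sketch of the broadcasting idea above, using only NumPy. The axis sizes and the variable names (lons, lats, lon_open, lat_open) are made up for illustration and are not taken from the attached script:

```python
import numpy as np

# Hypothetical 1-D coordinate axes; the sizes are made up for illustration.
lons = np.linspace(-180.0, 180.0, 2000)
lats = np.linspace(-90.0, 90.0, 1000)

# Dense grids: meshgrid returns two full (1000, 2000) arrays.
lon2d, lat2d = np.meshgrid(lons, lats)

# Broadcastable ("open") grids: shapes (1000, 1) and (1, 2000).
lat_open = lats[:, np.newaxis]
lon_open = lons[np.newaxis, :]

# np.ogrid builds the same kind of open grid from slice notation.
# A complex step means "number of points", endpoints included.
lat_og, lon_og = np.ogrid[-90:90:1000j, -180:180:2000j]

dense_bytes = lon2d.nbytes + lat2d.nbytes        # 32,000,000 bytes
open_bytes = lon_open.nbytes + lat_open.nbytes   # 24,000 bytes

# Arithmetic on the open grids broadcasts to the full shape only in the result.
field_open = np.cos(np.deg2rad(lat_open)) * np.sin(np.deg2rad(lon_open))
field_dense = np.cos(np.deg2rad(lat2d)) * np.sin(np.deg2rad(lon2d))
```

Both fields come out identical; the difference is that the open-grid version never holds the two dense coordinate arrays. Whether a given plotting call (or Basemap) accepts open grids directly is something to check case by case.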
When I get a chance, I’ll look through the script for any other obvious
savings.
Cheers!
Ben Root
On Tue, Sep 9, 2014 at 9:02 AM, Ryan Nelson <rnelsonchem@…287…> wrote:
Raffaele,
As Ben pointed out, you might be creating a lot of in-memory NumPy arrays
that you probably don’t need/want.
For example, I think (?) slicing the whole variable, as below:
    lons = fh.variables['lon'][:]
is making a copy of all that (mmap’ed) data as a NumPy array in memory.
Get rid of the slice ([:]). Of course, these variables are not NumPy
arrays, so you’ll have to change some of your code. For example:
    lon_0 = lons.mean()
will have to become:
    lon_0 = np.mean(lons)
If lats and lons are very large sets of data, then meshgrid will make two
very, very large arrays in memory.
For example, try this:
    np.meshgrid(np.arange(5), np.arange(5))
The output is two much larger arrays:
    [array([[0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4],
            [0, 1, 2, 3, 4]]),
     array([[0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1],
            [2, 2, 2, 2, 2],
            [3, 3, 3, 3, 3],
            [4, 4, 4, 4, 4]])]
I don’t know Basemap at all, so I don’t know if this is necessary. You
might be able to force the meshgrid output into a memmap file, but I don’t
know how to do that right now. Perhaps someone else has some suggestions.
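One possible way to get meshgrid-like output into a memmap file is sketched below. It assumes NumPy's np.lib.format.open_memmap; the file names, sizes, and the temporary directory are made up for illustration, and this has not been tried with Basemap:

```python
import os
import tempfile

import numpy as np

# Hypothetical 1-D axes; the sizes are made up for illustration.
lons = np.linspace(-180.0, 180.0, 2000)
lats = np.linspace(-90.0, 90.0, 1000)
shape = (lats.size, lons.size)

# Scratch directory for the file-backed arrays (not cleaned up here).
tmpdir = tempfile.mkdtemp()
lon_path = os.path.join(tmpdir, "lon2d.npy")
lat_path = os.path.join(tmpdir, "lat2d.npy")

# open_memmap creates a .npy file on disk and returns a writable memmap.
lon2d = np.lib.format.open_memmap(lon_path, mode="w+", dtype=lons.dtype, shape=shape)
lat2d = np.lib.format.open_memmap(lat_path, mode="w+", dtype=lats.dtype, shape=shape)

# Broadcasting assignment fills the file-backed arrays directly,
# without first building dense in-memory copies via meshgrid.
lon2d[:] = lons[np.newaxis, :]
lat2d[:] = lats[:, np.newaxis]
lon2d.flush()
lat2d.flush()

# Reopen read-only; pages are loaded from disk only when accessed.
lon2d_ro = np.load(lon_path, mmap_mode="r")
lat2d_ro = np.load(lat_path, mmap_mode="r")
```

Note that this only keeps the grids themselves out of RAM. A plotting call that runs np.where or masked-array operations on its inputs will still build full in-memory temporaries, so a memmap alone may not avoid a MemoryError.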
Hope that helps.
Ryan
On Tue, Sep 9, 2014 at 4:07 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:
Hi Jody and Ben,
thanks for your answers.
I tried to use pcolormesh instead of pcolor and the result is very good!
As for the memory problem, I wasn’t able to solve it.
When I tried to use the bigger file, I got the same problem. Attached you
will find the script that I’m using to make the plot. Maybe I didn’t
understand very well how to use the mmap function.
Regards,
Raffaele.
-----Original Message-----
From: Jody Klymak [mailto:jklymak@…4192…]
Sent: Mon 9/8/2014 5:46 PM
To: Benjamin Root
Cc: Raffaele Quarta; Matplotlib Users
Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)
It looks like you are calling pcolor. Can I suggest you try pcolormesh?
75 Mb is not a big file!
Cheers, Jody
On Sep 8, 2014, at 7:38 AM, Benjamin Root <ben.root@…3203…04…> wrote:
(Keeping this on the mailing list so that others can benefit)
What might be happening is that you are keeping more numpy
arrays in memory than you actually need. Take advantage of memmapping,
which most netcdf tools provide by default. This keeps the data on disk
rather than in RAM. Second, for very large images, I would suggest either
pcolormesh() or just simply imshow() instead of pcolor() as they are
way more efficient than pcolor(). In addition, it sounds like you are
dealing with re-sampled data (“at different zoom levels”). Does this mean
that you are re-running contour on re-sampled data? I am not sure what the
benefit of doing that is if one could just simply do the contour once at
the highest resolution.
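As a rough sketch of the pcolormesh() suggestion: the example below uses a plain Matplotlib Axes rather than Basemap, and random data in place of the NetCDF variable. The grid size and all names are assumptions for illustration only:

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, enough for PNG output
import matplotlib.pyplot as plt

# Made-up data standing in for one 2-D field from the NetCDF file.
nlat, nlon = 300, 600
rng = np.random.default_rng(0)
t = rng.random((nlat, nlon))

# 1-D cell edges: one more edge than cells along each axis.
lon_edges = np.linspace(-180.0, 180.0, nlon + 1)
lat_edges = np.linspace(-90.0, 90.0, nlat + 1)

fig, ax = plt.subplots()
# pcolormesh draws a single QuadMesh instead of one polygon per cell,
# and it accepts 1-D coordinate arrays, so no meshgrid call is needed.
mesh = ax.pcolormesh(lon_edges, lat_edges, t)
fig.colorbar(mesh, ax=ax)

# Render to an in-memory PNG; fig.savefig("plot.png") would write a file.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
png_bytes = buf.getvalue()
plt.close(fig)
```

Closing the figure after saving matters when producing many images in a loop (for example, one per zoom level), since open figures otherwise stay in memory.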
Without seeing any code, though, I can only provide generic suggestions.
Cheers!
Ben Root
On Mon, Sep 8, 2014 at 10:12 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:
Hi Ben,
sorry for giving you so few details. I’m trying to make a
contour plot of a variable at different zoom levels using high
resolution data. The aim is to obtain .PNG output images. Currently, I’m
working with big data (a NetCDF file of about 75Mb). The current
Matplotlib version on my Ubuntu 14.04 machine is 1.3.1. My system
has 8Gb of RAM.
I’m running into memory problems when I try to make a
plot. I got the following error message:
    cs = m.pcolor(xi,yi,np.squeeze(t))
  File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 521, in with_transform
    return plotfunc(self,x,y,data,*args,**kwargs)
  File "/usr/lib/pymodules/python2.7/mpl_toolkits/basemap/__init__.py", line 3375, in pcolor
    x = ma.masked_values(np.where(x > 1.e20,1.e20,x), 1.e20)
  File "/usr/lib/python2.7/dist-packages/numpy/ma/core.py", line 2195, in masked_values
    condition = umath.less_equal(mabs(xnew - value), atol + rtol * mabs(value))
MemoryError
By contrast, when I plot a smaller file (such as 5Mb), it
works very well. I don’t believe there is anything wrong in the script.
It might be a memory problem.
I hope that my message is clearer now.
Thanks for the help.
Regards,
Raffaele
Sent: Mon 9/8/2014 3:19 PM
To: Raffaele Quarta
Cc: Matplotlib Users
Subject: Re: [Matplotlib-users] Plotting large file (NetCDF)
You will need to be more specific… much more specific. What kind of plot
are you making? How big is your data? What version of matplotlib are you
using? How much RAM do you have available compared to the amount of data
(most slowdowns are actually due to swap-thrashing issues)? Matplotlib can
be used for large data, but there exist some specialty tools for
truly large datasets. The solution depends on the situation.
Ben Root
On Mon, Sep 8, 2014 at 7:45 AM, Raffaele Quarta <raffaele.quarta@…4572…> wrote:
Hi,
I’m working with the NetCDF format. When I try to plot a very large
file, I have to wait a long time for the plot. How can I solve this?
Is there a solution for this problem?
Raffaele
–
This email was Virus checked by Astaro Security Gateway.
Want excitement?
Manually upgrade your production database.
When you want reliability, choose Perforce
Perforce version control. Predictably reliable.
http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk
Matplotlib-users mailing list
https://lists.sourceforge.net/lists/listinfo/matplotlib-users