Fetching a data slab with NetCDFFile

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')

···

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
    3931 Returns an index or a sequence of indices.
    3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
    3934
    3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
    1006 # If the times do not correspond, then it means that the times
    1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
    1009
    1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
     959 return False
     960
--> 961 t = nctime[indices]
     962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
     963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
      65
      66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
      68 # automatically
      69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
     409 def __getitem__(self, key):
     410 # Return data from the array.
--> 411 return self.data[key]
     412
     413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
     112
     113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
     115
     116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
      19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
      20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
      22
      23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
    3931 Returns an index or a sequence of indices.
    3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
    3934
    3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
    1011 import bisect
    1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
    1014
    1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
        1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
        1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
        1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
        1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene <amg@…1985… >> <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
  
--
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
Arthur M. Greene, Ph.D.
The International Research Institute for Climate and Society (IRI)
The Earth Institute, Columbia University, Lamont Campus
Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
Tel: 845-680-4436 | Fax: 845-680-4865
amg@...1985... | http://iri.columbia.edu
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

Arthur M. Greene wrote:

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

Arthur: I forgot to update the wrapper function in __init__.py - that's fixed now if you do an svn update. Concerning your other problems below, using your test case exposed a couple of other bugs, but it still doesn't work. The basic problem is that the date2index function was designed to work with netCDF4 variable objects (http://code.google.com/p/netcdf4-python), and the netcdf file/variable objects that are produced by pupynere/pydap (the pure python netcdf /dap reader included in basemap) don't quite behave the same way. Using netCDF4, I can get your gfdl_test.nc case to work with

> cat testdate2index.py

#from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
from netCDF4 import Dataset as ncf
from netCDF4 import date2index, num2date
from mpl_toolkits import basemap
fname0 = ‘http://esgcet.llnl.gov/dap/
fname1 =\
‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
fname = fname0+fname1
#fname = ‘gfdl_test.nc’
print fname
datobj = ncf(fname)
print datobj.variables[‘tas’].shape
timedata = datobj.variables[‘time’]
from datetime import datetime as dt
date0 = dt(1951,1,16,12,0,0)
print num2date(timedata[:],timedata.units,calendar=timedata.calendar)
print date0
nt0 = date2index(date0,timedata,select=‘nearest’)
print nt0
print \
timedata[nt0],num2date(timedata[nt0],timedata.units,calendar=timedata.calendar)

> python testdate2index.py

gfdl_test.nc
(13, 31, 29)
[1950-08-16 12:00:00 1950-09-16 00:00:00 1950-10-16 12:00:00
1950-11-16 00:00:00 1950-12-16 12:00:00 1951-01-16 12:00:00
1951-02-15 00:00:00 1951-03-16 12:00:00 1951-04-16 00:00:00
1951-05-16 12:00:00 1951-06-16 00:00:00 1951-07-16 12:00:00
1951-08-16 12:00:00]
1951-01-16 12:00:00
5
[ 32865.5] [1951-01-16 12:00:00]

Your original example doesn't work because the URL is not an opendap server, it's some kind of CDAT xml file that presumably only CDAT understands.

We will see if we can fix the date2index function included in basemap (if not I will remove it), but for now I recommend using netcdf4-python. It's really a much more robust and feature-rich solution for netcdf reading and writing.

-Jeff

···

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1006 # If the times do not correspond, then it means that the times
   1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
   1009
   1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
    959 return False
    960
--> 961 t = nctime[indices]
    962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
    963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
     65
     66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
     68 # automatically
     69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
    409 def __getitem__(self, key):
    410 # Return data from the array.
--> 411 return self.data[key]
    412
    413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
    112
    113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
    115
    116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
     19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
     20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
     22
     23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1011 import bisect
   1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
   1014
   1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
       1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
       1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
       1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
       1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene >>> <amg@…1985… <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
  
--
Jeffrey S. Whitaker Phone : (303)497-6313
Meteorologist FAX : (303)497-6449
NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker@...259...
325 Broadway Office : Skaggs Research Cntr 1D-113
Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

Thanks much. I am able to replicate your results using netcdf4.

FYI, I don't believe the xml file is a CDAT creation; rather, it is probably written using CMOR (http://www2-pcmdi.llnl.gov/cmor), which was used to standardize the IPCC model output files, presumably so they could be accessed by a variety of applications via OpenDAP. Hmmmm...

At any rate, I can access the remote data object with netcdf4, but no luck retrieving either data or a time index.

In [94]: datobj = ncf(fname)
In [95]: timedata = datobj.variables['time']
In [97]: taxvals = timedata[1070:1090]
In [99]: print taxvals
[ 32559.5 32590. 32620.5 32651. 32681.5 32712.5 32743. 32773.5
   32804. 32834.5 32865.5 32895. 32924.5 32955. 32985.5 33016.
   33046.5 33077.5 33108. 33138.5]
In [100]: print date2index(date0,timedata.units,timedata.calendar,select='nearest')

···

---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getattr__ (netCDF4.c:13593)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4._get_att (netCDF4.c:1806)()

AttributeError: NetCDF: Attribute not found

In [96]: print datobj.variables['tas'].shape
(1680, 90, 144)
In [101]: testdat = datobj.variables['tas'][0,:,:]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getitem__ (netCDF4.c:14286)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable._get (netCDF4.c:18945)()

RuntimeError: NetCDF: Variable has no data in DAP request

--------------------------

Well, at least the error messages are different...

Thanks again for all the assistance. It would be useful to access the IPCC output with OpenDap at some point.

Best,

Arthur

Jeff Whitaker wrote:

Arthur M. Greene wrote:

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

Arthur: I forgot to update the wrapper function in __init__.py - that's fixed now if you do an svn update. Concerning your other problems below, using your test case exposed a couple of other bugs, but it still doesn't work. The basic problem is that the date2index function was designed to work with netCDF4 variable objects (http://code.google.com/p/netcdf4-python), and the netcdf file/variable objects that are produced by pupynere/pydap (the pure python netcdf /dap reader included in basemap) don't quite behave the same way. Using netCDF4, I can get your gfdl_test.nc case to work with

> cat testdate2index.py

#from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
from netCDF4 import Dataset as ncf
from netCDF4 import date2index, num2date
from mpl_toolkits import basemap
fname0 = ‘http://esgcet.llnl.gov/dap/
fname1 =\
‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
fname = fname0+fname1
#fname = ‘gfdl_test.nc’
print fname
datobj = ncf(fname)
print datobj.variables[‘tas’].shape
timedata = datobj.variables[‘time’]
from datetime import datetime as dt
date0 = dt(1951,1,16,12,0,0)
print num2date(timedata[:],timedata.units,calendar=timedata.calendar)
print date0
nt0 = date2index(date0,timedata,select=‘nearest’)
print nt0
print \
timedata[nt0],num2date(timedata[nt0],timedata.units,calendar=timedata.calendar)

> python testdate2index.py

gfdl_test.nc
(13, 31, 29)
[1950-08-16 12:00:00 1950-09-16 00:00:00 1950-10-16 12:00:00
1950-11-16 00:00:00 1950-12-16 12:00:00 1951-01-16 12:00:00
1951-02-15 00:00:00 1951-03-16 12:00:00 1951-04-16 00:00:00
1951-05-16 12:00:00 1951-06-16 00:00:00 1951-07-16 12:00:00
1951-08-16 12:00:00]
1951-01-16 12:00:00
5
[ 32865.5] [1951-01-16 12:00:00]

Your original example doesn't work because the URL is not an opendap server, it's some kind of CDAT xml file that presumably only CDAT understands.

We will see if we can fix the date2index function included in basemap (if not I will remove it), but for now I recommend using netcdf4-python. It's really a much more robust and feature-rich solution for netcdf reading and writing.
-Jeff

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1006 # If the times do not correspond, then it means that the times
   1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
   1009
   1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
    959 return False
    960
--> 961 t = nctime[indices]
    962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
    963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
     65
     66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
     68 # automatically
     69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
    409 def __getitem__(self, key):
    410 # Return data from the array.
--> 411 return self.data[key]
    412
    413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
    112
    113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
    115
    116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
     19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
     20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
     22
     23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1011 import bisect
   1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
   1014
   1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
       1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
       1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
       1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
       1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene >>>> <amg@…1985… <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
  
--
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
Arthur M. Greene, Ph.D.
The International Research Institute for Climate and Society
The Earth Institute, Columbia University, Lamont Campus
Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
amg at iri dot columbia dot edu | http://iri.columbia.edu
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

Arthur M. Greene wrote:

Thanks much. I am able to replicate your results using netcdf4.

FYI, I don't believe the xml file is a CDAT creation; rather, it is probably written using CMOR (http://www2-pcmdi.llnl.gov/cmor), which was used to standardize the IPCC model output files, presumably so they could be accessed by a variety of applications via OpenDAP. Hmmmm...

At any rate, I can access the remote data object with netcdf4, but no luck retrieving either data or a time index.

In [94]: datobj = ncf(fname)
In [95]: timedata = datobj.variables['time']
In [97]: taxvals = timedata[1070:1090]
In [99]: print taxvals
[ 32559.5 32590. 32620.5 32651. 32681.5 32712.5 32743. 32773.5
  32804. 32834.5 32865.5 32895. 32924.5 32955. 32985.5 33016.
  33046.5 33077.5 33108. 33138.5]
In [100]: print date2index(date0,timedata.units,timedata.calendar,select='nearest')

Arthur: That's because the timedata variable has no attributes (no calendar or units), and the date2index function looks for these attributes. That's weird though, since that dataset is supposed to be CF compliant. I wonder if openDAP is not handling that xml file correctly.

-Jeff

···

---------------------------------------------------------------------------

AttributeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getattr__ (netCDF4.c:13593)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4._get_att (netCDF4.c:1806)()

AttributeError: NetCDF: Attribute not found

In [96]: print datobj.variables['tas'].shape
(1680, 90, 144)
In [101]: testdat = datobj.variables['tas'][0,:,:]
---------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getitem__ (netCDF4.c:14286)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable._get (netCDF4.c:18945)()

RuntimeError: NetCDF: Variable has no data in DAP request

--------------------------

Well, at least the error messages are different...

Thanks again for all the assistance. It would be useful to access the IPCC output with OpenDap at some point.

Best,

Arthur

Jeff Whitaker wrote:

Arthur M. Greene wrote:

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

Arthur: I forgot to update the wrapper function in __init__.py - that's fixed now if you do an svn update. Concerning your other problems below, using your test case exposed a couple of other bugs, but it still doesn't work. The basic problem is that the date2index function was designed to work with netCDF4 variable objects (http://code.google.com/p/netcdf4-python), and the netcdf file/variable objects that are produced by pupynere/pydap (the pure python netcdf /dap reader included in basemap) don't quite behave the same way. Using netCDF4, I can get your gfdl_test.nc case to work with

> cat testdate2index.py

#from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
from netCDF4 import Dataset as ncf
from netCDF4 import date2index, num2date
from mpl_toolkits import basemap
fname0 = ‘http://esgcet.llnl.gov/dap/
fname1 =\
‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
fname = fname0+fname1
#fname = ‘gfdl_test.nc’
print fname
datobj = ncf(fname)
print datobj.variables[‘tas’].shape
timedata = datobj.variables[‘time’]
from datetime import datetime as dt
date0 = dt(1951,1,16,12,0,0)
print num2date(timedata[:],timedata.units,calendar=timedata.calendar)
print date0
nt0 = date2index(date0,timedata,select=‘nearest’)
print nt0
print \
timedata[nt0],num2date(timedata[nt0],timedata.units,calendar=timedata.calendar)

> python testdate2index.py

gfdl_test.nc
(13, 31, 29)
[1950-08-16 12:00:00 1950-09-16 00:00:00 1950-10-16 12:00:00
1950-11-16 00:00:00 1950-12-16 12:00:00 1951-01-16 12:00:00
1951-02-15 00:00:00 1951-03-16 12:00:00 1951-04-16 00:00:00
1951-05-16 12:00:00 1951-06-16 00:00:00 1951-07-16 12:00:00
1951-08-16 12:00:00]
1951-01-16 12:00:00
5
[ 32865.5] [1951-01-16 12:00:00]

Your original example doesn't work because the URL is not an opendap server, it's some kind of CDAT xml file that presumably only CDAT understands.

We will see if we can fix the date2index function included in basemap (if not I will remove it), but for now I recommend using netcdf4-python. It's really a much more robust and feature-rich solution for netcdf reading and writing.
-Jeff

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1006 # If the times do not correspond, then it means that the times
   1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
   1009
   1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
    959 return False
    960
--> 961 t = nctime[indices]
    962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
    963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
     65
     66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
     68 # automatically
     69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
    409 def __getitem__(self, key):
    410 # Return data from the array.
--> 411 return self.data[key]
    412
    413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
    112
    113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
    115
    116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
     19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
     20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
     22
     23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1011 import bisect
   1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
   1014
   1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
       1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
       1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
       1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
       1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene >>>>> <amg@…1985… <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
  
--
Jeffrey S. Whitaker Phone : (303)497-6313
Meteorologist FAX : (303)497-6449
NOAA/OAR/PSD R/PSD1 Email : Jeffrey.S.Whitaker@...259...
325 Broadway Office : Skaggs Research Cntr 1D-113
Boulder, CO, USA 80303-3328 Web : http://tinyurl.com/5telg

Just to add a little info:

I've been poking around various OPeNDAP servers looking for files to try and open (and read), and have had a little success, so the module does seem to work, if not all the time for my purposes. At the moment I'm on a 64-bit machine (Fedora 10), so this is encouraging. Some details:

I tried several of the IPCC AR4 models at PCMDI, with results similar to what I reported earlier. The time object appears with neither units nor a calendar. Looking at the metadata shows this not to be correct, for at least the three models I investigated (gfdl_cm2_1, mpi_echam5, ncar_ccsm3_0). I believe the inclusion of units and a calendar are standard procedure for all of these models, and would probably cause the dataset to be flagged if they weren't present. Many users (like hundreds) have downloaded and analyzed these files.

The IRI data library (http://iridl.ldeo.columbia.edu/) has a large collection of datasets, all available using opendap. But I had problems with the ones I tried because the calendars seem all to be given as "360", rather than "360_day". (Perhaps someone is cutting corners with the typing, I can't say...) I couldn't correct this by setting timedata.calendar='360_day' because the files are opened read-only. There must be files on this server with differently-defined calendars, since the data come from many different sources. I'll have to root around some more to turn some up.

Similarly, the time units in http://test.opendap.org/opendap/data/nc/data.nc are given simply as 'hour', so num2date can't figure out what dates the time values refer to. I wouldn't have expected this at this URL, but maybe it's a test? Aside from the fact that "... since" was missing, netcdf4 also complained that 'hour' was not an acceptable unit. Only 'hours' will do. (No 'months' or 'years' either, it seems.)

http://test.opendap.org/opendap/data/nc/coads_climatology.nc seems to download OK, and there are units, but it's a 12-month climatology, so calendar is irrelevant. I could plot the data, although it appeared reversed left-to-right. (I didn't add axes, but just plotted it raw.)

The conclusion seems to be that (1) there may be a lot of non-conforming datasets out there (and netcdf4 may be a little fussy about what time units it will accept, too), but (2) since there seems to be some discordance w.r.t. the IPCC data (where we believe the units and calendar must actually be present) one cannot be absolutely sure that all of the problems experienced are solely due to malformed data descriptions. Evidently more detective work will be required to sort everything out...

Best, thanks again for the assistance. I've been up too late chasing around the web...

Arthur

Jeff Whitaker wrote:

···

Arthur M. Greene wrote:

Thanks much. I am able to replicate your results using netcdf4.

FYI, I don't believe the xml file is a CDAT creation; rather, it is probably written using CMOR (http://www2-pcmdi.llnl.gov/cmor), which was used to standardize the IPCC model output files, presumably so they could be accessed by a variety of applications via OpenDAP. Hmmmm...

At any rate, I can access the remote data object with netcdf4, but no luck retrieving either data or a time index.

In [94]: datobj = ncf(fname)
In [95]: timedata = datobj.variables['time']
In [97]: taxvals = timedata[1070:1090]
In [99]: print taxvals
[ 32559.5 32590. 32620.5 32651. 32681.5 32712.5 32743. 32773.5
  32804. 32834.5 32865.5 32895. 32924.5 32955. 32985.5 33016.
  33046.5 33077.5 33108. 33138.5]
In [100]: print date2index(date0,timedata.units,timedata.calendar,select='nearest')

Arthur: That's because the timedata variable has no attributes (no calendar or units), and the date2index function looks for these attributes. That's weird though, since that dataset is supposed to be CF compliant. I wonder if openDAP is not handling that xml file correctly.

-Jeff

---------------------------------------------------------------------------

AttributeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getattr__ (netCDF4.c:13593)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4._get_att (netCDF4.c:1806)()

AttributeError: NetCDF: Attribute not found

In [96]: print datobj.variables['tas'].shape
(1680, 90, 144)
In [101]: testdat = datobj.variables['tas'][0,:,:]
---------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getitem__ (netCDF4.c:14286)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable._get (netCDF4.c:18945)()

RuntimeError: NetCDF: Variable has no data in DAP request

--------------------------

Well, at least the error messages are different...

Thanks again for all the assistance. It would be useful to access the IPCC output with OpenDap at some point.

Best,

Arthur

Jeff Whitaker wrote:

Arthur M. Greene wrote:

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

Arthur: I forgot to update the wrapper function in __init__.py - that's fixed now if you do an svn update. Concerning your other problems below, using your test case exposed a couple of other bugs, but it still doesn't work. The basic problem is that the date2index function was designed to work with netCDF4 variable objects (http://code.google.com/p/netcdf4-python), and the netcdf file/variable objects that are produced by pupynere/pydap (the pure python netcdf /dap reader included in basemap) don't quite behave the same way. Using netCDF4, I can get your gfdl_test.nc case to work with

> cat testdate2index.py

#from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
from netCDF4 import Dataset as ncf
from netCDF4 import date2index, num2date
from mpl_toolkits import basemap
fname0 = ‘http://esgcet.llnl.gov/dap/
fname1 =\
‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
fname = fname0+fname1
#fname = ‘gfdl_test.nc’
print fname
datobj = ncf(fname)
print datobj.variables[‘tas’].shape
timedata = datobj.variables[‘time’]
from datetime import datetime as dt
date0 = dt(1951,1,16,12,0,0)
print num2date(timedata[:],timedata.units,calendar=timedata.calendar)
print date0
nt0 = date2index(date0,timedata,select=‘nearest’)
print nt0
print \
timedata[nt0],num2date(timedata[nt0],timedata.units,calendar=timedata.calendar)

> python testdate2index.py

gfdl_test.nc
(13, 31, 29)
[1950-08-16 12:00:00 1950-09-16 00:00:00 1950-10-16 12:00:00
1950-11-16 00:00:00 1950-12-16 12:00:00 1951-01-16 12:00:00
1951-02-15 00:00:00 1951-03-16 12:00:00 1951-04-16 00:00:00
1951-05-16 12:00:00 1951-06-16 00:00:00 1951-07-16 12:00:00
1951-08-16 12:00:00]
1951-01-16 12:00:00
5
[ 32865.5] [1951-01-16 12:00:00]

Your original example doesn't work because the URL is not an opendap server, it's some kind of CDAT xml file that presumably only CDAT understands.

We will see if we can fix the date2index function included in basemap (if not I will remove it), but for now I recommend using netcdf4-python. It's really a much more robust and feature-rich solution for netcdf reading and writing.
-Jeff

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1006 # If the times do not correspond, then it means that the times
   1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
   1009
   1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
    959 return False
    960
--> 961 t = nctime[indices]
    962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
    963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
     65
     66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
     68 # automatically
     69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
    409 def __getitem__(self, key):
    410 # Return data from the array.
--> 411 return self.data[key]
    412
    413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
    112
    113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
    115
    116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
     19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
     20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
     22
     23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1011 import bisect
   1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
   1014
   1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
       1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
       1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
       1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
       1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene >>>>>> <amg@…1985… <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
  
--
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
Arthur M. Greene, Ph.D.
The International Research Institute for Climate and Society (IRI)
The Earth Institute, Columbia University, Lamont Campus
Monell Building, 61 Route 9W, Palisades, NY 10964-8000 USA
amg at iri dot columbia dot edu | http://iri.columbia.edu
*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

Arthur M. Greene wrote:

Just to add a little info:

I've been poking around various OPeNDAP servers looking for files to try and open (and read), and have had a little success, so the module does seem to work, if not all the time for my purposes. At the moment I'm on a 64-bit machine (Fedora 10), so this is encouraging. Some details:

I tried several of the IPCC AR4 models at PCMDI, with results similar to what I reported earlier. The time object appears with neither units nor a calendar. Looking at the metadata shows this not to be correct, for at least the three models I investigated (gfdl_cm2_1, mpi_echam5, ncar_ccsm3_0). I believe the inclusion of units and a calendar are standard procedure for all of these models, and would probably cause the dataset to be flagged if they weren't present. Many users (like hundreds) have downloaded and analyzed these files.

The IRI data library (http://iridl.ldeo.columbia.edu/) has a large collection of datasets, all available using opendap. But I had problems with the ones I tried because the calendars seem all to be given as "360", rather than "360_day". (Perhaps someone is cutting corners with the typing, I can't say...) I couldn't correct this by setting timedata.calendar='360_day' because the files are opened read-only. There must be files on this server with differently-defined calendars, since the data come from many different sources. I'll have to root around some more to turn some up.

Arthur: It's only the time manipulation functions (date2index, num2date, date2num) that require the time attributes to be CF compliant. You can still read the data with the netCDF module, even if the time attributes are not CF compliant, or not there at all. It is odd that the time attributes appear missing. That does appear to be a bug in the client, or could it have something to do with the fact that the PCC openDAP pages at LLNL appear to be password protected?

I'll email the unidata folks and see what they think. Thanks for your all your testing.

-Jeff

···

Similarly, the time units in http://test.opendap.org/opendap/data/nc/data.nc are given simply as 'hour', so num2date can't figure out what dates the time values refer to. I wouldn't have expected this at this URL, but maybe it's a test? Aside from the fact that "... since" was missing, netcdf4 also complained that 'hour' was not an acceptable unit. Only 'hours' will do. (No 'months' or 'years' either, it seems.)

http://test.opendap.org/opendap/data/nc/coads_climatology.nc seems to download OK, and there are units, but it's a 12-month climatology, so calendar is irrelevant. I could plot the data, although it appeared reversed left-to-right. (I didn't add axes, but just plotted it raw.)

The conclusion seems to be that (1) there may be a lot of non-conforming datasets out there (and netcdf4 may be a little fussy about what time units it will accept, too), but (2) since there seems to be some discordance w.r.t. the IPCC data (where we believe the units and calendar must actually be present) one cannot be absolutely sure that all of the problems experienced are solely due to malformed data descriptions. Evidently more detective work will be required to sort everything out...

Best, thanks again for the assistance. I've been up too late chasing around the web...

Arthur

Jeff Whitaker wrote:

Arthur M. Greene wrote:

Thanks much. I am able to replicate your results using netcdf4.

FYI, I don't believe the xml file is a CDAT creation; rather, it is probably written using CMOR (http://www2-pcmdi.llnl.gov/cmor), which was used to standardize the IPCC model output files, presumably so they could be accessed by a variety of applications via OpenDAP. Hmmmm...

At any rate, I can access the remote data object with netcdf4, but no luck retrieving either data or a time index.

In [94]: datobj = ncf(fname)
In [95]: timedata = datobj.variables['time']
In [97]: taxvals = timedata[1070:1090]
In [99]: print taxvals
[ 32559.5 32590. 32620.5 32651. 32681.5 32712.5 32743. 32773.5
  32804. 32834.5 32865.5 32895. 32924.5 32955. 32985.5 33016.
  33046.5 33077.5 33108. 33138.5]
In [100]: print date2index(date0,timedata.units,timedata.calendar,select='nearest')

Arthur: That's because the timedata variable has no attributes (no calendar or units), and the date2index function looks for these attributes. That's weird though, since that dataset is supposed to be CF compliant. I wonder if openDAP is not handling that xml file correctly.

-Jeff

---------------------------------------------------------------------------

AttributeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getattr__ (netCDF4.c:13593)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4._get_att (netCDF4.c:1806)()

AttributeError: NetCDF: Attribute not found

In [96]: print datobj.variables['tas'].shape
(1680, 90, 144)
In [101]: testdat = datobj.variables['tas'][0,:,:]
---------------------------------------------------------------------------

RuntimeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable.__getitem__ (netCDF4.c:14286)()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/netCDF4.so in netCDF4.Variable._get (netCDF4.c:18945)()

RuntimeError: NetCDF: Variable has no data in DAP request

--------------------------

Well, at least the error messages are different...

Thanks again for all the assistance. It would be useful to access the IPCC output with OpenDap at some point.

Best,

Arthur

Jeff Whitaker wrote:

Arthur M. Greene wrote:

Thanks much for the quick response. I updated both matplotlib and
basemap (now at 0.99.5) via svn and noticed the new netcdftime.py. First, from within site-packages/mpl_toolkits/basemap,

$ grep date2index *.py
__init__.py::func:`date2index`: compute a time variable index
corresponding to a date.
__init__.py:def date2index(dates, nctime, calendar='proleptic_gregorian'):
__init__.py: return netcdftime.date2index(dates, nctime, calendar=None)
netcdftime.py:def date2index(dates, nctime, calendar=None, select='exact'):
netcdftime.py: date2index(dates, nctime, calendar=None, select='exact')

so there seems to be some disagreement between __init__.py and
netcdftime.py concerning the presence of the "select" argument. When I
call date2index with the "select" keyword arg I get

In [24]: ix0 = date2index(date0,timedata,timedata.calendar,select='nearest')
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

TypeError: date2index() got an unexpected keyword argument 'select'

Arthur: I forgot to update the wrapper function in __init__.py - that's fixed now if you do an svn update. Concerning your other problems below, using your test case exposed a couple of other bugs, but it still doesn't work. The basic problem is that the date2index function was designed to work with netCDF4 variable objects (http://code.google.com/p/netcdf4-python), and the netcdf file/variable objects that are produced by pupynere/pydap (the pure python netcdf /dap reader included in basemap) don't quite behave the same way. Using netCDF4, I can get your gfdl_test.nc case to work with

> cat testdate2index.py

#from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
from netCDF4 import Dataset as ncf
from netCDF4 import date2index, num2date
from mpl_toolkits import basemap
fname0 = ‘http://esgcet.llnl.gov/dap/
fname1 =\
‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
fname = fname0+fname1
#fname = ‘gfdl_test.nc’
print fname
datobj = ncf(fname)
print datobj.variables[‘tas’].shape
timedata = datobj.variables[‘time’]
from datetime import datetime as dt
date0 = dt(1951,1,16,12,0,0)
print num2date(timedata[:],timedata.units,calendar=timedata.calendar)
print date0
nt0 = date2index(date0,timedata,select=‘nearest’)
print nt0
print \
timedata[nt0],num2date(timedata[nt0],timedata.units,calendar=timedata.calendar)

> python testdate2index.py

gfdl_test.nc
(13, 31, 29)
[1950-08-16 12:00:00 1950-09-16 00:00:00 1950-10-16 12:00:00
1950-11-16 00:00:00 1950-12-16 12:00:00 1951-01-16 12:00:00
1951-02-15 00:00:00 1951-03-16 12:00:00 1951-04-16 00:00:00
1951-05-16 12:00:00 1951-06-16 00:00:00 1951-07-16 12:00:00
1951-08-16 12:00:00]
1951-01-16 12:00:00
5
[ 32865.5] [1951-01-16 12:00:00]

Your original example doesn't work because the URL is not an opendap server, it's some kind of CDAT xml file that presumably only CDAT understands.

We will see if we can fix the date2index function included in basemap (if not I will remove it), but for now I recommend using netcdf4-python. It's really a much more robust and feature-rich solution for netcdf reading and writing.
-Jeff

-----------------------

This detail aside, I am still having difficulty with date2index, but
annoyingly, I seem to get different error messages with different datasets. I'll illustrate two here, starting with the one I initially posted about. (See note below regarding this data.)

In [3]: from mpl_toolkits.basemap import date2index,num2date,NetCDFFile as ncf
In [10]: from mpl_toolkits import basemap
In [11]: print basemap.__version__
0.99.5
In [24]: fname0 = ‘http://esgcet.llnl.gov/dap/
In [25]: fname1 = ‘ipcc4/20c3m/gfdl_cm2_1/pcmdi.ipcc4.gfdl_cm2_1.20c3m.run1.atm.mo.xml’
In [26]: fname = fname0+fname1
In [28]: datobj = ncf(fname)
In [33]: datobj.variables[‘tas’].shape
Out[33]: (1680, 90, 144)
In [34]: timedata = datobj.variables[‘time’]
In [35]: from datetime import datetime as dt
In [36]: date0 = dt(1951,1,16,12,0,0)
In [37]: print date0
1951-01-16 12:00:00
In [38]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

ClientError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1006 # If the times do not correspond, then it means that the times
   1007 # are not increasing uniformly and we try the bisection method.
-> 1008 if not _check_index(index, dates, nctime, calendar):
   1009
   1010 # Use the bisection method. Assumes the dates are ordered.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in _check_index(indices, dates, nctime, calendar)
    959 return False
    960
--> 961 t = nctime[indices]
    962 return numpy.all( num2date(t, nctime.units, calendar) == dates)
    963

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdf.pyc in __getitem__(self, index)
     65
     66 def __getitem__(self, index):
---> 67 datout = squeeze(self._var.__getitem__(index))
     68 # automatically
     69 # - remove singleton dimensions

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/dtypes.pyc in __getitem__(self, key)
    409 def __getitem__(self, key):
    410 # Return data from the array.
--> 411 return self.data[key]
    412
    413 def __setitem__(self, key, item):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/proxy.pyc in __getitem__(self, index)
    112
    113 # Fetch data.
--> 114 resp, data = openurl(url, self.cache, self.username, self.password)
    115
    116 # First lines are ASCII information that end with 'Data:\n'.

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/dap/util/http.pyc in openurl(url, cache, username, password)
     19 m = re.search('code = (?P<code>\d+);\s*message = "(?P<msg>.*)"', data, re.DOTALL | re.MULTILINE)
     20 msg = 'Server error %(code)s: "%(msg)s"' % m.groupdict()
---> 21 raise ClientError(msg)
     22
     23 return resp, data

ClientError: 'Server error 0: "invalid literal for int(): [1113"'

-----------------------------

Note that this is a different error than previously reported. Also, the correct time index is still 1080:

In [40]: taxvals = datobj.variables['time'][:]

In [41]: num2date(taxvals[1080],timedata.units,timedata.calendar)
Out[41]: 1951-01-16 12:00:00

-----------------------------

This dataset, generated by one of the IPCC models, is password-protected, but could be a good target for decoding, since it is typical of a large class of climate models, that generate a lot of analytical activity. To get a password (they're free) one must register. Info is here: http://www-pcmdi.llnl.gov/ipcc/info_for_analysts.php. Follow "How to access..." then "Register to download output." Once you get a userid and password they can be inserted in the NetCDFFile call, voila. Note that there is a new iteration of IPCC coming down the pike; new model files to become widely available starting in 2010.

------------------------------

The underlying data is available via ftp. I fetched it and extracted a small slab, which is available at http://iri.columbia.edu/~amg/test/gfdl_test.nc. The CDAT package can digest this file; first time step is plotted here: http://iri.columbia.edu/~amg/test/gfdl_test_time0.png. The dates can also be read by this package, and run from Aug 1950 to Aug 1951, inclusive (13 mos). So the file does not seem to be garbage.

In [16]: datobj = ncf('gfdl_test.nc')
In [17]: timedata = datobj.variables['time']
In [18]: date0 = dt(1951,1,16,12,0,0)
In [19]: nt0 = date2index(date0,timedata)
---------------------------------------------------------------------------

TypeError Traceback (most recent call last)

/home/amg/work/nhmm/<ipython console> in <module>()

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc in date2index(dates, nctime, calendar)
   3931 Returns an index or a sequence of indices.
   3932 """
-> 3933 return netcdftime.date2index(dates, nctime, calendar=None)
   3934
   3935 def maskoceans(lonsin,latsin,datain,inlands=False):

/home/amg/usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc in date2index(dates, nctime, calendar, select)
   1011 import bisect
   1012
-> 1013 index = numpy.array([bisect.bisect_left(nctime, n) for n in num], int)
   1014
   1015 nomatch = num2date(nctime[index], nctime.units) != dates

TypeError: object of type 'netcdf_variable' has no len()

Investigating the time axis,

In [20]: taxvals = timedata[:]

In [21]: taxvals
Out[21]: array([ 32712.5, 32743. , 32773.5, 32804. , 32834.5, 32865.5, 32895. , 32924.5, 32955. , 32985.5, 33016. , 33046.5, 33077.5])

In [22]: num2date(taxvals,timedata.units,timedata.calendar)
Out[22]: array([1950-08-16 12:00:00, 1950-09-16 00:00:00, 1950-10-16 12:00:00,
       1950-11-16 00:00:00, 1950-12-16 12:00:00, 1951-01-16 12:00:00,
       1951-02-15 00:00:00, 1951-03-16 12:00:00, 1951-04-16 00:00:00,
       1951-05-16 12:00:00, 1951-06-16 00:00:00, 1951-07-16 12:00:00,
       1951-08-16 12:00:00], dtype=object)

Which agrees with what CDAT sees.

-------------------------

I think this is enough for now. I also had problems opening data files whose time units were like "months since xxxx-xx-xx," since the "months" unit does not seem to be supported. ("years since..." could also be useful in some cases.) But maybe one or two things at a time is enough!

Thanks for any assistance/advice!

Best,

Arthur

Jeff Whitaker wrote:

David Huard wrote:

Arthur,

I wrote the date2index function and I think what you are seeing is a bug that I fixed a couple of months ago. By using the latest version of netcdf4-python, not only should this bug disappear, but you'll also find that date2index now supports different selection methods: 'exact', 'before', 'after', 'nearest', that should help with your use case.

If this does not fix the problem you are seeing, I'd appreciate having a copy of the file and code to reproduce the problem and find a solution.

HTH,

David Huard

Arthur: I've just updated basemap svn with David's latest version of date2index, so another option is to update basemap from svn. Or, even simpler, just drop the attached netcdftime.py file in lib/mpl_toolkits/basemap (replacing the old one) and run python setup.py install.

-Jeff

On Mon, Sep 7, 2009 at 9:27 AM, Arthur M. Greene >>>>>>> <amg@…1985… <mailto:amg@…1985…>> wrote:

    Hi All,

    The problem is not with fetching the data slice itself, but
    finding the
    correct indices to specify, particularly with the time dimension. The
    below examples refer to a remote dataset that I can open and slice
    using
    indices, as in

    slice = remoteobj.variables['tas'][:120,20:40,30:50].

    However, I have problems when trying to use the syntax in
    plotsst.py or
    pnganim.py (from the examples) to find time indices:

    In [107]: from datetime import datetime as dt
    In [108]: date0 = dt(1951,1,1,0)
    In [110]: print date0
    1951-01-01 00:00:00

    In [125]: timedata = remoteobj.variables['time']
    In [126]: nt0 = date2index(date0,timedata)
    ---------------------------------------------------------------------------

    AssertionError Traceback (most recent
    call last)

    /home/amg/work/nhmm/<ipython console> in <module>()

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/__init__.pyc

    in date2index(dates, nctime, calendar)
       3924 Returns an index or a sequence of indices.
       3925 """
    -> 3926 return netcdftime.date2index(dates, nctime, calendar=None)
       3927
       3928 def maskoceans(lonsin,latsin,datain,inlands=False):

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in date2index(dates, nctime, calendar)
        986
        987 # Perform check again.
    --> 988 _check_index(index, dates, nctime, calendar)
        989
        990 # convert numpy scalars or single element arrays to python
    ints.

    /usr/local/cdat/trunk/lib/python2.5/site-packages/mpl_toolkits/basemap/netcdftime.pyc

    in _check_index(indices, dates, nctime, calendar)
        941 for n,i in enumerate(indices):
        942 t[n] = nctime[i]
    --> 943 assert numpy.all( num2date(t, nctime.units, calendar)
    == dates)
        944
        945

    AssertionError:

    ---------------------------------------------------------

    It turns out that date0 corresponds best to index 1080:

    In [139]: remoteobj.variables['time'][1080]
    Out[139]: 32865.5

    In [141]: num2date(32865.5,timedata.units,timedata.calendar)
    Out[141]: 1951-01-16 12:00:00

    This isn't the _exact_ date and time I had specified, but

    In [142]: date0 = dt(1951,01,16,12,00,00)
    In [143]: print date0
    1951-01-16 12:00:00

    In [144]: date2index(date0,timedata,timedata.calendar)

    produces the same AssertionError. Where is the problem?

    What I would _like_ to do is to issue a simple call using coordinates
    rather than the indices, of the form:

    slice = variable[date0:date1,[plev],lat0:lat1,lon0:lon1],

    or similar, preferably without writing a whole module just to find the
    correct indices. I need to fetch similar slices from a group of
    models,
    having time axes that may each be defined slightly differently --
    different calendars, time point set at a different day of the month,
    etc. (It's monthly data and I'm specifying only monthly bounds, even
    though the calendar may be defined as "days since 1860...") I need to
    automate the process so I get back the correct slab regardless.

    Suggestions appreciated!

    Thx,

    Arthur

    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*
    Arthur M. Greene, Ph.D.
    The International Research Institute for Climate and Society (IRI)
    The Earth Institute, Columbia University, Lamont Campus

    amg at iri dot columbia dot edu | http://iri.columbia.edu
    *^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*^*~*

    ------------------------------------------------------------------------------

    Let Crystal Reports handle the reporting - Free Crystal Reports
    2008 30-Day
    trial. Simplify your report design, integration and deployment -
    and focus on
    what you do best, core application coding. Discover what's new with
    Crystal Reports now. http://p.sf.net/sfu/bobj-july
    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    <mailto:Matplotlib-users@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users

------------------------------------------------------------------------

------------------------------------------------------------------------------

Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july
------------------------------------------------------------------------

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users