cross correlation

Dear Users,

I am relatively new to Matplotlib. I wanted to find the cross-correlation between two time series for my research, and while looking at the options available in Python I found [http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr). However, I wanted to save the results in a netCDF file for further use, i.e. the correlation, the lags, and the significance if possible. Is there a way to get the correlation and lags from axes.xcorr? Any help in this matter will be greatly appreciated.

Sudheer

···

Sudheer Joseph
Indian National Centre for Ocean Information Services
Ministry of Earth Sciences, Govt. of India
POST BOX NO: 21, IDA
Jeedeemetla P.O.
Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55
Tel:+91-40-23886047(O),Fax:+91-40-23895011(O),
Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile)
E-mail:sjo.India@…287…;sudheer.joseph@…9…
Web- http://oppamthadathil.tripod.com


Sudheer,

A call to axes.xcorr returns the lags, correlation (from np.correlate) and
the line artists on the figure.

In IPython, doing "plt.xcorr??" should provide sufficient information. It's
a pretty simple method.
-paul


Thank you very much Hobson,

However, I think I did not fully understand your suggestion (pardon my ignorance). I am using the test code below from the matplotlib site. How does one make a call to get the lags and correlation corresponding to the x and y values in the plot? Printing ax1.xcorr

In [23]: print ax1.xcorr

<bound method AxesSubplot.xcorr of <matplotlib.axes.AxesSubplot object at 0x44c1410>>

gives the result above. Is it possible to assign the output, as in xcorr,lags=ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2), perhaps with a different syntax? I get the error below when I try this.

In [27]: xcorr,lags=ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/home/sjo/work/PY_WORK/stats/<ipython-input-27-e1e58c045ad4> in <module>()
----> 1 xcorr,lags=ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2)

ValueError: too many values to unpack

import matplotlib.pyplot as plt
import numpy as np
x,y = np.random.randn(2,100)
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2)
ax1.grid(True)
ax1.axhline(0, color='black', lw=2)
ax2 = fig.add_subplot(212, sharex=ax1)
ax2.acorr(x, usevlines=True, normed=True, maxlags=50, lw=2)
ax2.grid(True)
ax2.axhline(0, color='black', lw=2)
plt.show()


Sudheer,

For the documentation you are looking for, use

print ax1.xcorr.__doc__

(Paul was pointing you to the IPython way of getting that documentation, which is to type a ? (or ??) after the desired object.)

In the documentation (at the link you gave http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xcorr), it says that there are three objects returned by xcorr:
Return value is a tuple (*lags*, *c*, *line*) where:

  - *lags* are a length ``2*maxlags+1`` lag vector

  - *c* is the ``2*maxlags+1`` auto correlation vector

  - *line* is a :class:`~matplotlib.lines.Line2D` instance
     returned by :func:`~matplotlib.pyplot.plot`.

So the error you were getting is due to the fact that you have only specified two variables to hold the three returned objects.

Try:
lags,c,line = ax1.xcorr .....

(Note that you have xcorr and lags backwards in your attempt.)
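
As a minimal sketch of capturing the return value (assuming the headless Agg backend and random test data like the example script): recent Matplotlib releases document a fourth returned item in addition to lags, c and line, so the sketch keeps the whole tuple and takes the first two items by index, which works for either tuple length.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is opened
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, only to make the sketch repeatable
x, y = rng.randn(2, 100)

fig = plt.figure()
ax1 = fig.add_subplot(211)

# Keep the whole return value, then pick out the lags and the correlations.
result = ax1.xcorr(x, y, usevlines=True, maxlags=50, normed=True, lw=2)
lags, c = result[0], result[1]

plt.close(fig)  # release the figure once the numbers have been extracted
```

Here lags and c are plain NumPy arrays of length 2*maxlags+1, so they can be written to a file like any other array.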

-Sterling


Thank you very much Smith and Paul,

I was away from the office due to a medical situation, so I could not respond and thank you for the help. I have got the results now, and the tips from both of you were extremely useful. I am facing an issue with the code when I call plt.xcorr in a loop: Python's memory usage builds up until it reaches whatever RAM is available (on my 4 GB laptop it gets almost full, and on my 24 GB desktop it uses whatever is available). I suspected that the plot was not being closed during each iteration, so I added a plt.close('all') in the loop. After that the code takes a long time to run, whereas before it was faster until RAM usage reached its maximum.

Is there a way out of this situation? I am attaching the code here, along with the link to the data I am using. If possible, kindly help.

 ftp
ftpser.incois.gov.in
user temp
password incoistemp
cd /home0/temp/comp
bin
mget qu_test.nc.gz
gunzip qu_test.nc.gz

create_ncf.py (891 Bytes)

gen_xcorr_wnd.py (784 Bytes)


Hi Sudheer,


Thanks for sharing the code. From a quick look at gen_xcorr_wnd.py, you are generating quite a large number (about len(lons)*len(lats)) of xcorr series over 365 lags. Here are two reasons why I would not recommend using xcorr from matplotlib for this job:

1) There is an overhead in creating a plot object, which is unnecessary since you're only interested in the correlation values.

2) Internally, plt.xcorr uses numpy.correlate (https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/axes.py#L4319 and https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py#L731), which is quite fast but unfortunately cannot be tuned well in terms of the output length (there are only three modes: 'valid', 'same' or 'full'; Matplotlib uses 'full').

All this to say that when you're interested in 365 correlation values, the internal computation takes place on (N+M-1) points (where N and M are the lengths of the input vectors, i.e. 2189 if I'm right), and so about 90 % of the output is thrown away.

This being said, there is a tiny issue: I don't know of a good module that provides a cross-correlation function. statsmodels has acf (i.e. autocorrelation), but I don't remember whether it has cross-correlation. Its acf has two computation modes: one based on FFT, and one based on numpy.correlate, which suffers from the same problem as matplotlib's xcorr (https://github.com/statsmodels/statsmodels/blob/master/statsmodels/tsa/stattools.py#L347).

best,
Pierre
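
One possible way around both points is to skip the plot entirely and evaluate only the lags that are needed. The sketch below is an illustration under stated assumptions, not a library function: `xcorr_values` is a hypothetical helper name, both series are assumed to have the same length, and the scaling mirrors what `normed=True` does (division by sqrt(sum(x^2) * sum(y^2))). The final lines cross-check it against the sliced 'full' output of np.correlate.

```python
import numpy as np


def xcorr_values(x, y, maxlags):
    """Normalized cross-correlation for lags -maxlags..maxlags, no plotting.

    Hypothetical helper: it follows the lag convention of
    np.correlate(x, y, mode="full") but evaluates only the requested lags.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    if maxlags >= n:
        raise ValueError("maxlags must be smaller than the series length")

    lags = np.arange(-maxlags, maxlags + 1)
    c = np.empty(len(lags))
    for i, k in enumerate(lags):
        if k >= 0:
            # c[k] = sum_n x[n + k] * y[n]
            c[i] = np.dot(x[k:], y[:n - k])
        else:
            c[i] = np.dot(x[:n + k], y[-k:])
    # Same scaling as normed=True: divide by sqrt(sum(x^2) * sum(y^2)).
    c /= np.sqrt(np.dot(x, x) * np.dot(y, y))
    return lags, c


rng = np.random.RandomState(1)  # illustrative random data
x, y = rng.randn(2, 500)
lags, c = xcorr_values(x, y, maxlags=20)

# Cross-check against the full np.correlate output, sliced to the same lags.
full = np.correlate(x, y, mode="full") / np.sqrt(np.dot(x, x) * np.dot(y, y))
reference = full[len(x) - 1 - 20:len(x) + 20]
```

Because no figure or artist is created, nothing accumulates between iterations when this is called in a loop over grid points.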

Thank you Pierre,

                          I will test the other options. I did not know about this limitation on the output length in plt.xcorr.

Thanks a lot

with best regards,

Sudheer


Just for reference: http://stackoverflow.com/questions/6991471/computing-cross-correlation-function
You'll see that (cross-)correlation in Python is a long-running topic.
best,
Pierre

Dear Pierre,

I was checking plt.xcorr, and it calls np.correlate inside it, as np.correlate(ts1,ts2, mode=2).

Is there a way to see which vector is slid back in time? I.e. given ts1[t1,t2,t3,t4…] and ts2[t1,t2,t3,t4…], is ts2[t2] correlated with ts1[t1], or is ts2[t1] correlated with ts1[t2] (to make out which one is the cause and which is the effect)? A cross-correlation can be obtained by sliding either ts1 or ts2 back in time. Is there a way to know which is done? I am not able to make much of np.correlate. Also, is there a way to get the 95% significance or a p-value from xcorr, as in MATLAB? In MATLAB, xcorr can be called with the coeff option instead of the default cross-correlation; is there a similar option in matplotlib?

sincerely.

Sudheer




Sudheer:

It sounds like your needs are beyond the scope of matplotlib. It'll
probably be more productive to check in with the numpy or scipy mailing
lists.
-paul


Hi,

I was checking the plt.xcorr and it calls the np.correlate in side it.
It calls np.correlate(ts1,ts2, mode=2).

Just as a side note, mode=2 is the old fashioned way to specify
mode='full' [1]. This may help in reading the numpy.correlate doc.
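
The lag convention of numpy.correlate can also be checked numerically rather than read off the documentation. The sketch below uses made-up data: x is built as a copy of y delayed by three samples (the names n and delay are illustrative), and the position of the peak in the 'full' output shows which way the lag is counted. It says nothing about cause and effect, only about which series is shifted.

```python
import numpy as np

rng = np.random.RandomState(2)
n, delay = 200, 3

y = rng.randn(n)
# x is a copy of y delayed by `delay` samples: x[t] = y[t - delay].
x = np.zeros(n)
x[delay:] = y[:n - delay]

c = np.correlate(x, y, mode="full")
lags = np.arange(-(n - 1), n)  # lag belonging to each element of the 'full' output
peak_lag = lags[np.argmax(c)]
print(peak_lag)
```

With this construction the peak falls at lag +3, i.e. a peak at a positive lag in np.correlate(x, y) means the first argument is delayed relative to the second.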

This being said, I'm really unfamiliar with cross-correlations. I just
kind of know the usual 95% confidence interval for autocorrelation, at
1.96/sqrt(n). As a quick check, this is what R uses by default, but
there are options like ci.type to get more appropriate intervals for an MA
series
(http://stat.ethz.ch/R-manual/R-patched/library/stats/html/plot.acf.html)
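
A rough sketch of applying that 1.96/sqrt(n) band to cross-correlation values follows. It rests on assumptions that should be checked for real data: both series are treated as white noise, the means are removed first (plt.xcorr does not do this unless a detrend function is passed), and the data here are just two unrelated random series for illustration.

```python
import numpy as np

rng = np.random.RandomState(3)
n, maxlags = 1000, 20
x, y = rng.randn(2, n)  # two unrelated white-noise series (illustrative data)

# Remove the means so the values behave like correlation coefficients.
x = x - x.mean()
y = y - y.mean()

full = np.correlate(x, y, mode="full") / np.sqrt(np.dot(x, x) * np.dot(y, y))
c = full[n - 1 - maxlags:n + maxlags]
lags = np.arange(-maxlags, maxlags + 1)

bound = 1.96 / np.sqrt(n)          # approximate 95% band under white noise
outside = lags[np.abs(c) > bound]  # lags whose correlation leaves the band
```

For autocorrelated series this band is too narrow, which is the situation the ci.type option in R is meant to address.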

best,
Pierre

[1] https://github.com/numpy/numpy/blob/master/numpy/core/numeric.py#L678


Thank you Pierre.

with best regards,

Sudheer
