Cross-correlation?

Hi y'all.

I have two series of data, taken over a period of time and I want to
know whether one set of data contains a signal similar to that in the
other. I believe the way to do this is to use cross-correlation(*) and
I notice that dir(pylab) now contains a function called
cross_correlate, but given two arrays, that function returns a single
scalar value (whereas I was expecting a list of correlation
coefficients corresponding to how well the two signals match on
successive lags).

So ... does matplotlib have something akin to what I'm after or is
there an extention that might have it?

Very many thanks,

Sarah

(*)

http://www.mathworks.com/access/helpdesk/help/toolbox/garch/index.html?/access/helpdesk/help/toolbox/garch/crosscorr.html
http://www.mathworks.com/access/helpdesk/help/toolbox/signal/index.html?/access/helpdesk/help/toolbox/signal/xcorr.html

···

--
Work: http://www.cogentcomputing.org
Blog: http://varspool.blogspot.com
Photos: http://flickr.com/photos/sarahmount/

Ciao Sarah

cross_correlate, but given two arrays, that function returns a single
scalar value (whereas I was expecting a list of correlation
coefficients corresponding to how well the two signals match on
successive lags).

That's not really a mpl question, more a 'numerix' one: what does your numerix
stand for ? numpy ? numarray ?

For numpy, I guess that the corresponding function is 'correlate', which you
must use with the 'full' parameter:

corxy = N.correlate(x, y, 'full')

Unless I'm mistaken, the lags are then 1-n, 2-n,..., -1, 0, 1, ..., n-1
Note that you may want to use anomalies (ie, x-x.mean() and y-y.mean()), and
divide by the variances to get proper coefficients

A second possibility is to use FFTs,

Fx = N.fft.fft(x, npad, axis=axis)
Fy = N.fft.fft(y, npad, axis=axis)
iFxy = N.fft.ifft(Fx.conj()*Fy,n=npad,axis=axis).real

with npad=x.size+y.size
The lags are then 0,1,2,...,n,1-n,...,-1

If you're interested, you can use these two functions that I had written

#..............................................................................
def ccf(x, y, axis=None):
    """Computes the cross-correlation function of two series `x` and `y`.
Note that the computations are performed on anomalies (deviations from
average).
Returns the values of the cross-correlation at different lags.
Lags are given as [0,1,2,...,n,n-1,n-2,...,-2,-1].

:Parameters:
    `x` : 1D MaskedArray
        Time series.
    `y` : 1D MaskedArray
        Time series.
    `axis` : integer *[None]*
        Axis along which to compute (0 for rows, 1 for cols).
        If `None`, the array is flattened first.
    """
    assert(x.ndim == y.ndim, "Inconsistent shape !")
# assert(x.shape == y.shape, "Inconsistent shape !")
    if axis is None:
        if x.ndim > 1:
            x = x.ravel()
            y = y.ravel()
        npad = x.size + y.size
        xanom = (x - x.mean(axis=None))
        yanom = (y - y.mean(axis=None))
        Fx = N.fft.fft(xanom, npad, )
        Fy = N.fft.fft(yanom, npad, )
        iFxy = N.fft.ifft(Fx.conj()*Fy).real
        varxy = N.sqrt(N.inner(xanom,xanom) * N.inner(yanom,yanom))
    else:
        npad = x.shape[axis] + y.shape[axis]
        if axis == 1:
            if x.shape[0] != y.shape[0]:
                raise ValueError, "Arrays should have the same length!"
            xanom = (x - x.mean(axis=1)[:,None])
            yanom = (y - y.mean(axis=1)[:,None])
            varxy = N.sqrt((xanom*xanom).sum(1) * (yanom*yanom).sum(1))
[:,None]
        else:
            if x.shape[1] != y.shape[1]:
                raise ValueError, "Arrays should have the same width!"
            xanom = (x - x.mean(axis=0))
            yanom = (y - y.mean(axis=0))
            varxy = N.sqrt((xanom*xanom).sum(0) * (yanom*yanom).sum(0))
        Fx = N.fft.fft(xanom, npad, axis=axis)
        Fy = N.fft.fft(yanom, npad, axis=axis)
        iFxy = N.fft.ifft(Fx.conj()*Fy,n=npad,axis=axis).real

···

#
    return iFxy/varxy
#..............................................................................
def ccf1d(x,y):
    """Computes the crosscorrelation of two flat arrays `x` and `y`, with the
numpy.correlate function.
Note that the computations are performed on anomalies (deviations from
average).
    """
    if x.ndim > 1:
        x = x.ravel()
    if y.ndim > 1:
        y = y.ravel()
    (xanom, yanom) = (x-x.mean(), y-y.mean())
    corxy = N.correlate(xanom, yanom, 'full')
    n = min(x.size, y.size)
    # return N.r_[ xc[len(yf)-1:], 0, xc[:len(yf)-1] ]
    corxy = N.r_[ corxy[:n][::-1], 0, corxy[n:][::-1] ]
    varxy = N.sqrt(N.inner(xanom,xanom) * N.inner(yanom,yanom))
    return corxy/varxy

So ... does matplotlib have something akin to what I'm after or is
there an extention that might have it?

Very many thanks,

Sarah

(*)

http://www.mathworks.com/access/helpdesk/help/toolbox/garch/index.html?/acc
ess/helpdesk/help/toolbox/garch/crosscorr.html
http://www.mathworks.com/access/helpdesk/help/toolbox/signal/index.html?/ac
cess/helpdesk/help/toolbox/signal/xcorr.html