plot cdf

Alan_G_Isaac1 · September 26, 2007, 9:09pm

Is there a standard function or practice for
plotting the CDF of a series? (I am aware
of the output of hist.)

Thank you,
Alan Isaac

Tommy_Grav1 · September 26, 2007, 11:07pm

Without really knowing what CDF is (I am assuming
it is the cumulative distribution function or something
similar), I would suggest taking a look at numpy and the
histogram and cumsum functions therein.

Cheers
Tommy
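
The histogram-plus-cumsum suggestion above can be sketched as follows. This is only a binned approximation of the CDF, and the function name binned_cdf and the default of 10 bins are illustrative assumptions, not something given in the thread.

```python
import numpy as np

def binned_cdf(x, bins=10):
    """Approximate the CDF of x from a histogram (hypothetical helper)."""
    # Count the observations falling in each bin.
    counts, edges = np.histogram(x, bins=bins)
    # Running total of the counts, scaled so the last value is 1.
    cdf = np.cumsum(counts) / float(counts.sum())
    # Each cumulative value belongs to the right edge of its bin.
    return edges[1:], cdf
```

The two returned arrays can be passed straight to pylab.plot. The result depends on the number of bins, so it is coarser than an exact empirical CDF built from the sorted data.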

Robert_Kern3 · September 27, 2007, 12:11am

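import numpy as np
from matplotlib import pylab

x = ...  # whatever (your 1-D data)
x = np.sort(x)  # the step construction below assumes ascending order
n = len(x)
# Repeat every x value and every probability level twice so that
# the plotted line traces the vertical and horizontal step edges.
x2 = np.repeat(x, 2)
y2 = np.hstack([0.0, np.repeat(np.arange(1, n) / float(n), 2), 1.0])
pylab.plot(x2, y2)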

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

_David_Huard1 · September 27, 2007, 2:03pm

Hi Alan,

There is an empiricalcdf function in scipy/sandbox/dhuard/stats.py.
It's not fancy, but it might do what you want.

David

Alan_G_Isaac1 · September 27, 2007, 3:00pm

OK, that's pretty slick.
I did not think about ``repeat``.
Thanks,
Alan

On Wed, 26 Sep 2007, Robert Kern apparently wrote:

n = len(x)
x2 = np.repeat(x, 2)
y2 = np.hstack([0.0, np.repeat(np.arange(1,n) / float(n), 2), 1.0])
pylab.plot(x2, y2)

Alan_G_Isaac1 · September 27, 2007, 4:08pm

Well, I also did not think of a double application of
argsort to rank the observations. Nice. OK, I've got plenty
to work with now.

Thanks,
Alan

On Thu, 27 Sep 2007, David Huard apparently wrote:

scipy/sandbox/dhuard/stats.py
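
The double application of argsort mentioned above can be sketched like this. It illustrates the general idea only and is not the code from scipy/sandbox/dhuard/stats.py; the function name ecdf_by_rank is a hypothetical one chosen for the example.

```python
import numpy as np

def ecdf_by_rank(x):
    """Empirical CDF value at each observation, in the original order."""
    x = np.asarray(x)
    # The first argsort gives the indices that would sort x; the second
    # turns those indices into the 0-based rank of each observation.
    ranks = np.argsort(np.argsort(x))
    # An observation of rank r has r + 1 observations at or below it.
    return (ranks + 1) / float(len(x))
```

Note that tied values receive distinct ranks under this scheme, so repeated observations get different probabilities.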

Alan_G_Isaac1 · September 28, 2007, 4:18am

After thinking it over, I did not go for
Robert's or David's cool numpy tricks, but
I'll append a simple object in case someone
else wants to do more.

Cheers,
Alan Isaac

import numpy as N

class EmpiricalCDF(object):
    '''Empirical cdf.
    First point will be (xmin,0).
    Last point will be (xmax,1).
    '''
    def __init__(self, data, sortdata=True):
        # Pass sortdata=False only if data is already in ascending order.
        if sortdata:
            data = N.sort(data)
        self.data = data
        self.nobs = len(data)
    def gen_xp(self):
        '''Return: 2-tuple of generators,
        the step x values and the matching cumulative
        probabilities.
        '''
        data, nobs = self.data, self.nobs
        prob = N.linspace(0, 1, nobs+1)
        # Each data value appears twice: at the bottom and top of its step.
        xsteps = ( data[idx//2] for idx in range(2*nobs) )
        psteps = ( prob[(idx+1)//2] for idx in range(2*nobs) )
        return xsteps, psteps
    def get_steps(self):
        '''Return: 2-tuple of arrays,
        the data values and corresponding cumulative
        probabilities.
        '''
        xsteps, psteps = self.gen_xp()
        return N.fromiter(xsteps, float), N.fromiter(psteps, float)
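
For anyone who does want to do more, one natural extension is evaluating the empirical CDF at arbitrary points instead of only producing the step coordinates. The following is a minimal sketch, assuming the right-continuous convention (the fraction of observations less than or equal to t). The helper name ecdf_at is hypothetical and is not part of the class above.

```python
import numpy as np

def ecdf_at(data, t):
    """Fraction of observations in data that are <= t (t may be an array)."""
    data = np.sort(np.asarray(data, dtype=float))
    # side="right" counts the observations that are <= t, ties included.
    return np.searchsorted(data, t, side="right") / float(len(data))
```

With this convention the value at an observed data point is the top of its step in the plot.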