# plot cdf

Is there a standard function or practice for
plotting the CDF of a series? (I am aware
of the output of hist.)

Thank you,
Alan Isaac

Without really knowing what CDF is (I am assuming
it is Cumulative Density Function or something similar),
I would suggest taking a look at numpy and the histogram
and cumsum functions therein.
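
A minimal sketch of that suggestion, assuming a current NumPy where `np.histogram` returns `(counts, edges)`. The helper name `binned_cdf` and the default bin count are illustrative choices, not something from this thread. The result is a binned approximation of the CDF, so its resolution depends on the number of bins.

```python
import numpy as np

def binned_cdf(x, bins=10):
    """Approximate the CDF of x from a histogram (hypothetical helper)."""
    counts, edges = np.histogram(x, bins=bins)
    # Running total of the bin counts, normalized so the last value is 1.
    cdf = np.cumsum(counts) / float(counts.sum())
    # Pair each cumulative value with the right edge of its bin.
    return edges[1:], cdf

right_edges, cdf = binned_cdf([1.0, 2.0, 2.0, 3.0, 5.0], bins=4)
```

Passing `right_edges` and `cdf` to `pylab.plot` should then draw the approximate curve.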

Cheers
Tommy

···

On Sep 26, 2007, at 5:09 PM, Alan Isaac wrote:

Is there a standard function or practice for
plotting the CDF of a series? (I am aware
of the output of hist.)

Alan Isaac wrote:

Is there a standard function or practice for
plotting the CDF of a series? (I am aware
of the output of hist.)

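import numpy as np
from matplotlib import pylab

x = ... # whatever
x = np.sort(x)  # the staircase below assumes x is in ascending order
n = len(x)
# Repeat each x twice and shift the probabilities by one position so that
# consecutive points trace the vertical and horizontal parts of each step.
x2 = np.repeat(x, 2)
y2 = np.hstack([0.0, np.repeat(np.arange(1,n) / float(n), 2), 1.0])
pylab.plot(x2, y2)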
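
As an alternative to building the doubled arrays by hand, matplotlib can draw the steps itself. The following is only a sketch, assuming a matplotlib version where `pyplot.step` accepts `where="post"`. The names `ecdf_points` and `plot_ecdf` are hypothetical helpers, not part of any library.

```python
import numpy as np

def ecdf_points(x):
    """Return the sorted data and the empirical CDF value at each point."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    # The i-th smallest observation (1-based) gets probability i / n.
    ps = np.arange(1, n + 1) / float(n)
    return xs, ps

def plot_ecdf(x):
    """Draw the empirical CDF as a right-continuous staircase."""
    # Imported here so ecdf_points stays usable without matplotlib.
    import matplotlib.pyplot as plt
    xs, ps = ecdf_points(x)
    # where="post": each probability holds from its x until the next x.
    plt.step(xs, ps, where="post")
    plt.xlabel("x")
    plt.ylabel("cumulative probability")
    return plt.gca()

xs, ps = ecdf_points([3.0, 1.0, 2.0])
```

Unlike the hand-built arrays, this variant starts at (xmin, 1/n), so the initial rise from 0 is not drawn.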

···

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Hi Alan,

There is an empiricalcdf function in scipy/sandbox/dhuard/stats.py
It’s not fancy but it might do what you want.

David



OK, that's pretty slick.
I did not think about ``repeat``.
Thanks,
Alan

···

On Wed, 26 Sep 2007, Robert Kern apparently wrote:

n = len(x)
x2 = np.repeat(x, 2)
y2 = np.hstack([0.0, np.repeat(np.arange(1,n) / float(n), 2), 1.0])
pylab.plot(x2, y2)

Well, I also did not think of a double application of
argsort to rank the observations. Nice. OK, I've got plenty
to work with now.
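
For readers who have not seen the trick, a rough sketch of ranking by a double `argsort` follows. This illustrates the idea only and is not a copy of the sandbox function. The name `ecdf_at_observations` is hypothetical, and tied values receive distinct ranks here, which may or may not be what you want.

```python
import numpy as np

def ecdf_at_observations(x):
    """Empirical CDF value for each observation, in the original order."""
    x = np.asarray(x)
    # The first argsort gives the order that sorts x; the second turns
    # that order into the 0-based rank of each observation.
    ranks = np.argsort(np.argsort(x))
    return (ranks + 1) / float(len(x))

p = ecdf_at_observations([3.0, 1.0, 2.0])
```

The result lines up with the input, so `p[i]` is the cumulative probability at `x[i]` without reordering the data.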

Thanks,
Alan

···

On Thu, 27 Sep 2007, David Huard apparently wrote:

scipy/sandbox/dhuard/stats.py

After thinking it over, I did not go for
Robert's or David's cool numpy tricks, but
I'll append a simple object in case someone
else wants to do more.

Cheers,
Alan Isaac

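import numpy as N  # the code below uses the N alias for numpy

class EmpiricalCDF(object):
    '''Empirical cdf.
    First point will be (xmin,0).
    Last point will be (xmax,1).
    '''
    def __init__(self, data, sortdata=True):
        if sortdata:
            data = N.sort(data)
        self.data = data
        self.nobs = len(data)

    def gen_xp(self):
        '''Return: 2-tuple of generators, the step x values and
        the step probabilities.
        '''
        data, nobs = self.data, self.nobs
        prob = N.linspace(0, 1, nobs+1)
        # Each observation appears twice in x, while the probability index
        # advances one position earlier, so consecutive points trace a
        # vertical segment and then a horizontal one.
        # range replaces the Python 2 xrange.
        xsteps = ( data[idx//2] for idx in range(2*nobs) )
        psteps = ( prob[(idx+1)//2] for idx in range(2*nobs) )
        return xsteps, psteps

    def get_steps(self):
        '''Return: 2-tuple of arrays,
        the data values and corresponding cumulative
        probabilities.
        '''
        xsteps, psteps = self.gen_xp()
        return N.fromiter(xsteps,'f'), N.fromiter(psteps,'f')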
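
The index arithmetic in `gen_xp` can be checked against a vectorized version. The sketch below uses the same `idx//2` and `(idx+1)//2` pattern with array indexing instead of generators. The function name `ecdf_steps` is hypothetical, and always sorting the input is an assumption made here for simplicity.

```python
import numpy as np

def ecdf_steps(data):
    """Return step x values and probabilities for an empirical CDF."""
    data = np.sort(np.asarray(data, dtype=float))
    nobs = len(data)
    prob = np.linspace(0, 1, nobs + 1)
    idx = np.arange(2 * nobs)
    # x indices run 0,0,1,1,...; probability indices run 0,1,1,2,2,...
    return data[idx // 2], prob[(idx + 1) // 2]

xs, ps = ecdf_steps([3.0, 1.0, 2.0])
```

As in the class docstring, the first point is (xmin, 0) and the last is (xmax, 1), so the pair can be passed straight to a line plot.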