Is there a standard function or practice for

plotting the CDF of a series? (I am aware

of the output of hist.)

Thank you,

Alan Isaac

Without really knowing what CDF is (I am assuming

it is Cumulative Density Function or something similar),

I would suggest taking a look at numpy and the histogram

and cumsum functions therein.

Cheers

Tommy

On Sep 26, 2007, at 5:09 PM, Alan Isaac wrote:

Is there a standard function or practice for

plotting the CDF of a series? (I am aware

of the output of hist.)
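
Tommy's suggestion of numpy's histogram and cumsum functions can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name `binned_cdf`, the bin count, and the sample data are all assumptions. Note that the result is a binned approximation whose resolution depends on the bin width.

```python
import numpy as np

def binned_cdf(x, bins=10):
    """Approximate CDF from a histogram (hypothetical helper, not from the thread)."""
    counts, edges = np.histogram(x, bins=bins)
    # Running total of the bin counts, normalised so the last value is 1.0.
    cdf = np.cumsum(counts) / float(counts.sum())
    # cdf[i] is the fraction of observations in bins 0..i; pair it with
    # the right-hand edge of bin i for plotting.
    return edges[1:], cdf

sample = np.array([0.3, 1.2, 1.9, 2.5, 2.6, 3.8, 4.1, 5.0])  # made-up data
right_edges, cdf = binned_cdf(sample, bins=5)
```

Passing `right_edges` and `cdf` to `pylab.plot` should then draw the approximate curve.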

Alan Isaac wrote:

plotting the CDF of a series? (I am aware

of the output of hist.)

import numpy as np

from matplotlib import pylab

x = ... # whatever

x = np.sort(x)  # the repeat trick below assumes x is in ascending order

n = len(x)

x2 = np.repeat(x, 2)

y2 = np.hstack([0.0, np.repeat(np.arange(1,n) / float(n), 2), 1.0])

pylab.plot(x2, y2)

--

Robert Kern
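
To make it easier to see what the repeat trick above produces, here is a minimal sketch that wraps the same construction in a function and runs it on three made-up values. The function name `ecdf_steps` and the explicit sort are assumptions added for illustration; they are not part of Robert's message.

```python
import numpy as np

def ecdf_steps(x):
    """Staircase coordinates of the empirical CDF (hypothetical wrapper)."""
    x = np.sort(np.asarray(x, dtype=float))  # the construction assumes ascending order
    n = len(x)
    # Each data value appears twice: at the bottom and at the top of its jump.
    x2 = np.repeat(x, 2)
    # Heights 0, 1/n, 1/n, 2/n, 2/n, ..., (n-1)/n, (n-1)/n, 1.
    y2 = np.hstack([0.0, np.repeat(np.arange(1, n) / float(n), 2), 1.0])
    return x2, y2

x2, y2 = ecdf_steps([3.0, 1.0, 2.0])
# x2 is [1, 1, 2, 2, 3, 3] and y2 is [0, 1/3, 1/3, 2/3, 2/3, 1]
```

Joining these points in order with `pylab.plot(x2, y2)` draws a vertical jump of 1/n at each observation and a flat segment between observations.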

Hi Alan,

There is an empiricalcdf function in scipy/sandbox/dhuard/stats.py

It’s not fancy but it might do what you want.

David

2007/9/26, Alan Isaac <aisaac@…310…>:

Is there a standard function or practice for

plotting the CDF of a series? (I am aware

of the output of hist.)

Thank you,

Alan Isaac

OK, that's pretty slick.

I did not think about ``repeat``.

Thanks,

Alan

On Wed, 26 Sep 2007, Robert Kern apparently wrote:

n = len(x)

x2 = np.repeat(x, 2)

y2 = np.hstack([0.0, np.repeat(np.arange(1,n) / float(n), 2), 1.0])

pylab.plot(x2, y2)

Well, I also did not think of a double application of

argsort to rank the observations. Nice. OK, I've got plenty

to work with now.

Thanks,

Alan

On Thu, 27 Sep 2007, David Huard apparently wrote:

scipy/sandbox/dhuard/stats.py
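
The double application of argsort that Alan mentions can be sketched as below. This illustrates the ranking idea only and is not the code from the sandbox module; the function name `ecdf_at_observations` and the sample values are assumptions.

```python
import numpy as np

def ecdf_at_observations(x):
    """Empirical CDF value at each observation, in the original order
    (hypothetical helper, not the scipy sandbox code)."""
    x = np.asarray(x)
    # The first argsort gives the sorting order; applying argsort again
    # turns that order into the 0-based rank of each observation.
    ranks = np.argsort(np.argsort(x))
    return (ranks + 1) / float(len(x))

p = ecdf_at_observations([3.0, 1.0, 2.0])
# p is [1, 1/3, 2/3]: 3.0 is the largest value, 1.0 the smallest
```

One caveat with this approach: tied observations receive distinct consecutive ranks, so repeated values need extra handling if they should share a single CDF value.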

After thinking it over, I did not go for

Robert's or David's cool numpy tricks, but

I'll append a simple object in case someone

else wants to do more.

Cheers,

Alan Isaac

import numpy as N

class EmpiricalCDF(object):
    '''Empirical cdf.
    First point will be (xmin,0).
    Last point will be (xmax,1).
    '''
    def __init__(self, data, sortdata=True):
        if sortdata:
            data = N.sort(data)
        self.data = data
        self.nobs = len(data)

    def gen_xp(self):
        '''Return: 2-tuple of generators over the step coordinates.'''
        data, nobs = self.data, self.nobs
        # nobs+1 equally spaced probabilities: 0, 1/nobs, ..., 1
        prob = N.linspace(0, 1, nobs+1)
        # each data value is emitted twice, once per end of its jump
        xsteps = ( data[(idx)//2] for idx in range(2*nobs) )
        psteps = ( prob[(idx+1)//2] for idx in range(2*nobs) )
        return xsteps, psteps

    def get_steps(self):
        '''Return: 2-tuple of arrays,
        the data values and corresponding cumulative
        probabilities.
        '''
        xsteps, psteps = self.gen_xp()
        return N.fromiter(xsteps,'f'), N.fromiter(psteps,'f')
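
For anyone who does want to do more, one possible extension is evaluating the empirical CDF at arbitrary points rather than only generating plot coordinates. The sketch below is not part of Alan's class; the function name `ecdf_eval` and the sample values are assumptions, and it relies on `numpy.searchsorted` to count observations.

```python
import numpy as np

def ecdf_eval(data, q):
    """Fraction of observations <= q, for each query point in q
    (hypothetical helper, not from the thread)."""
    data = np.sort(np.asarray(data, dtype=float))
    # side='right' returns, for each query point, the number of
    # sorted observations that are less than or equal to it.
    counts = np.searchsorted(data, q, side='right')
    return counts / float(len(data))

values = ecdf_eval([3.0, 1.0, 2.0], [0.5, 1.0, 2.5, 3.0])
# values is [0, 1/3, 2/3, 1]
```

Unlike the ranking approach, counting with `side='right'` gives tied observations the same CDF value.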