histograms

I'm plotting some histograms with hist() (actually with ax.hist(), where ax is an Axes instance), and "normed=1" isn't working the way I would expect.

from pylab import *

data = sin(arange(0.0,100,.01))

fig = figure()
ax = fig.add_subplot(111)
ax.hist(data,bins=50,normed=1,align='center')
show()

If I do not include normed=1, then the Y scale is the actual count inside each bin. (The scale goes from 1 to 1000.)

If I include normed=1, the Y scale goes from 1 to 7. What does that mean? normed is supposed to make the first result returned by ax.hist a normalized probability distribution. But I would think that it would change the Y axis to be a probability as well, and it doesn't do that.

The docstrings do not give any insight, so I looked at the source code. It certainly *looks* like it's plotting the probability distribution. But why does the above example give a Y scale going from 1 to 7? Perhaps I'm showing my lack of statistics here, but I would think that a strict probability distribution would have the values of all of the bars adding to 1.

Sorry to send out so many messages today. I really am trying to figure this out on my own...

Simson,

Using your example I get most of the values around 0.5, and the ends near 2.3. This is correct for a probability density function; the integral of the pdf over the range of the bins should be 1. This way the pdf values as a function of x don't change with changes in the number of bins, apart from the change in resolution. The probability of a datum appearing in any subrange is the integral of the pdf over that subrange.
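
To make the "integral is 1" point concrete, here is a minimal sketch that uses NumPy's histogram function directly rather than ax.hist. It assumes a current NumPy, where this normalization is requested with density=True instead of the normed keyword used in the post. The variable names are only illustrative.

```python
import numpy as np

# Same data as in the original post.
data = np.sin(np.arange(0.0, 100, 0.01))

# density=True asks for probability-density normalization
# (assumed here to be the modern equivalent of normed=1).
density, edges = np.histogram(data, bins=50, density=True)
widths = np.diff(edges)

# The bar heights themselves do not sum to 1 ...
height_sum = density.sum()
# ... but the bar areas (height * bin width) do.
area = (density * widths).sum()

print(f"sum of bar heights: {height_sum:.3f}")
print(f"total bar area:     {area:.3f}")
print(f"tallest bar:        {density.max():.2f}")
```

Because the bins here are narrower than 1, individual bar heights can exceed 1 while the total area still comes out to 1.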

Having the sum of the bars add to 1 would be a different sort of normalization. Undoubtedly it has a name, but I don't know what it is. And I don't know why you were getting a y-axis up to 7.
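
For comparison, the other normalization (bars that sum to 1) can be sketched by giving every sample a weight of 1/N. Again this uses NumPy's histogram function as a stand-in; the weights approach is just one way to do it, not something taken from the matplotlib source.

```python
import numpy as np

# Same data as in the original post.
data = np.sin(np.arange(0.0, 100, 0.01))

# Each sample contributes 1/N, so each bar is the fraction of
# samples that fall in that bin.
weights = np.full(data.shape, 1.0 / data.size)
fractions, edges = np.histogram(data, bins=50, weights=weights)

# The density normalization differs only by a factor of the bin width.
density, _ = np.histogram(data, bins=50, density=True)
same_up_to_width = np.allclose(fractions, density * np.diff(edges))

print(f"sum of bar heights: {fractions.sum():.3f}")
print(f"fraction == density * width: {same_up_to_width}")
```

With this version the bar heights change whenever the number of bins changes, which is the drawback the density form avoids.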

Eric
