 # What's Wrong with this Histogram?

On 30 Nov 2009, at 17:46, Wayne Watson wrote:

That helped; I switched to the original data of 256 elements. So all the large values in the array beyond 120 would be tiny bars stretched out to an x of about 127516. OK, now with the original 256 elements I see some problems.

Individually, the elements contain some high counts, so I guess they are going off scale. This is unfortunate, since the original data was put into 256 bins by hardware from 307,000+ values. It looks like the raw values are what I should be feeding hist, but recreating the 307K values from the 256 counts seems wasteful, since it undoes what the hardware did. Is there some graph function that will treat the input as already binned? For example, if I have [10, 7, 5], I want to see a histogram of three bars: one at x=0 of height 10, one at x=1 of height 7, and one at x=2 of height 5. The x values might be some other numbers, like 18.2, 46.3 and 60.1.

···

Hello,

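hist takes the raw data directly, not an already-computed histogram.

If data is an array containing your pixels,
hist(data, bins=range(0, 257, 8), density=True) should do what you expect
(density replaces the normed keyword, which newer matplotlib releases no longer accept; the upper limit of 257 makes 256 the last bin edge, so that values 248-255 get a bin of their own).

The code you sent correctly counts 13 occurrences of 0 in freq and one of 121, with some rescaling.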

Pierre
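To see what Pierre means without opening a plot window, here is a minimal sketch using np.histogram, which does the same kind of counting as hist. It feeds the 32 counts in as if they were raw samples, with the same bin edges as the original call. The variable names edges, counts and nonzero_bins are just for illustration.

```python
import numpy as np

freq = np.array([127516, 8548, 46797, 46648, 21085, 9084, 7466, 6534, 5801,
                 5051, 4655, 4168, 4343, 3105, 2508, 2082, 1200, 488, 121,
                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# Same bin edges as the original call: 32 bins of width 8 covering 0..256.
edges = np.linspace(0, 256, 33)

# freq is treated as 32 raw samples. Values above 256 fall outside
# the bins and are silently dropped.
counts, _ = np.histogram(freq, bins=edges)

# Only two bins receive anything: the thirteen zeros and the single 121.
nonzero_bins = np.nonzero(counts)[0]
print(nonzero_bins, counts[nonzero_bins])
```

That accounts for the two bars in the figure: one near x = 0 (the zeros) and a shorter one around x = 120 to 128 (the value 121).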

On 30 Nov 2009, at 16:52, Wayne Watson wrote:

I'm working with a Python program that produces freq below. There are 32
bins. The bins represent 0-7, 8-15, ..., 248-255 of a set of
frequencies (integer counts). 0 to 255 are the brightness values of the
pixels in a 640x480 b/w frame. I binned 8 values into each of 32 bins. One
can easily see that the various bins are of different heights. However,
the result is a fixed-height bar from 0 to 10, and a shorter single bar
from about 120 to 130. The x-scale goes from 0 to 140 and not from 0 to
255, or somewhere in that range. It seems like hist is clumping
everything into two groups. I've changed the range parameter several
times and get the same result. I'd send an attachment of the figure, but
that often seems to delay a post in most of these Python mailing lists.

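import numpy as np
import matplotlib.pyplot as plt

nplt_bins = 32  # assumed value; not defined in the snippet as posted
freq = [127516, 8548, 46797, 46648, 21085, 9084, 7466, 6534, 5801,
5051, 4655, 4168, 4343, 3105, 2508, 2082, 1200, 488, 121, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]
fig = plt.figure()
v = np.array(freq)
# hist treats v as raw samples, so it counts how often each *count* occurs
plt.hist(v, bins=np.linspace(0, 256, nplt_bins + 1), density=True, range=(30, 200))
plt.show()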
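For reference, the 8-into-1 binning described above can be sketched with NumPy as below. This is only an illustration: the pixels array is random made-up data standing in for the 640x480 frame, and hist256 and freq32 are hypothetical names, not taken from the original program.

```python
import numpy as np

# Made-up stand-in for one 640x480 frame of 8-bit brightness values.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=640 * 480)

# 256-bin histogram: one count per brightness value 0..255.
hist256 = np.bincount(pixels, minlength=256)

# Sum each run of 8 adjacent bins: 0-7, 8-15, ..., 248-255.
freq32 = hist256.reshape(32, 8).sum(axis=1)

print(freq32.shape, freq32.sum())
```

The reshape works because 256 is exactly 32 x 8; each row then holds the 8 fine bins that make up one coarse bin.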

--

--

(121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time)
Obz Site: 39° 15' 7" N, 121° 2' 32" W, 2700 feet
The popular press and many authorities believe the number
of pedophiles that prowl the web is 50,00. There are no
figures that support this. The number of children below
18 years of age kidnapped by strangers is 1 in 600,00,
or 115 per year. -- The Science of Fear by D. Gardner
Web Page: <www.speckledwithstars.net/>

bar does what you need.

import numpy as np
import matplotlib.pyplot as plt

freq = np.array( [127516, 8548, 46797, 46648, 21085, 9084, 7466, 6534, 5801,
5051, 4655, 4168, 4343, 3105, 2508, 2082, 1200, 488, 121, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0] )

fig = plt.figure()
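plt.bar(range(0, 255, 8), freq * 1. / freq.sum(), width=8, align='edge')
# The 1. forces float division; integer division (Python 2) would give 0 everywhere.
# width=8 makes each bar 8 units wide, matching the spacing of range(0, 255, 8).
# align='edge' puts each bar's left edge at the bin's lower bound (newer matplotlib centers bars by default).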
plt.show()
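Another option, if you would rather stay with hist, is its weights argument. The sketch below passes one representative x value per bin (the bin center) and uses the precomputed counts as weights, so each bin ends up with exactly its count. The names edges, centers and heights are only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

freq = np.array([127516, 8548, 46797, 46648, 21085, 9084, 7466, 6534, 5801,
                 5051, 4655, 4168, 4343, 3105, 2508, 2082, 1200, 488, 121,
                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# 33 edges delimit the 32 bins 0-8, 8-16, ..., 248-256.
edges = np.arange(0, 257, 8)

# One sample per bin, placed at the bin center, weighted by that bin's count.
centers = 0.5 * (edges[:-1] + edges[1:])

fig, ax = plt.subplots()
heights, _, _ = ax.hist(centers, bins=edges, weights=freq)
ax.set_xlabel("brightness")
ax.set_ylabel("pixel count")
```

Call plt.show() afterwards to display the figure. Passing weights=freq / freq.sum() instead should give the same normalized heights as the bar version above.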

···


Thanks. Very good.


Another related question: is there some statistics function that computes the mean, standard deviation, min/max, etc. from a frequency distribution?
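One possible approach, sketched here with NumPy only, is to use the counts as weights in np.average. The helper name stats_from_counts is made up for this example and is not an existing library function. It computes the population standard deviation (dividing by N, not N - 1).

```python
import numpy as np

def stats_from_counts(values, counts):
    """Summary statistics of a frequency distribution.

    values: the x value each bin represents (e.g. bin centers)
    counts: how many samples fell into each bin
    """
    values = np.asarray(values, dtype=float)
    counts = np.asarray(counts, dtype=float)
    mean = np.average(values, weights=counts)
    # Weighted mean of squared deviations = population variance.
    var = np.average((values - mean) ** 2, weights=counts)
    # Min and max only make sense over bins that actually contain samples.
    occupied = values[counts > 0]
    return {
        "n": counts.sum(),
        "mean": mean,
        "std": np.sqrt(var),
        "min": occupied.min(),
        "max": occupied.max(),
    }

# The small example from earlier in the thread: heights [10, 7, 5] at x = 0, 1, 2.
stats = stats_from_counts([0, 1, 2], [10, 7, 5])
print(stats)
```

For data that was binned before you received it, the results are only as precise as the bin width, since every sample in a bin is treated as sitting at that bin's representative value.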
