Hi,

I would like to access values in the bins of a matplotlib histogram. The following example script is an attempt to do this. Clearly pdf contains floating point numbers, but I am unable to access them.

Help with this problem would be much appreciated.

Chris


--------------------------------------------------------------------------------------------------------------
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()

mu, sigma = 100, 15
x = mu + sigma * np.random.randn(20)

#Generate the histogram of the data. Example from Matplotlib documentation

n, bins, patches = plt.hist(x, 50, normed=True, facecolor='g', alpha=0.75)
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)

#From Matplotlib documentation.
#normed: If True, the first element of the return tuple will be the counts normalized
#to form a probability density, i.e., n/(len(x)*dbin). In a probability density,
#the integral of the histogram should be 1; you can verify that with a trapezoidal
#integration of the probability density function.

pdf, bins, patches = plt.hist(x, 50, normed=True, facecolor='g', alpha=0.75)

#print pdf shows pdf contains the value in each bin of the normed histogram

print "pdf = ", pdf

print " Integration of PDF = ", np.sum(pdf * np.diff(bins))

#How to access values in pdf? Various tries made but none successful. Example attempt shown

count=0
for line in open(pdf,'r+'):
    z=('%.10f' % float(x))
    count=count+1
print "count = ", count

----------------------------------------------------------------------------------------------------

Hi Chris,

I think I understand what you are asking. The key point is that I use "np.histogram" where you are using "plt.hist".

When I make my plots, I use plt.hist, but to access the data, I use np.histogram.
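
A minimal sketch of that split, written for current Python 3 and NumPy, where density=True takes the place of normed=True. The seed and the variable names (rng, edges, formatted) are assumptions made for illustration, not part of the original script:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only so the run is repeatable
mu, sigma = 100, 15
x = mu + sigma * rng.standard_normal(20)

# np.histogram returns a tuple: (value in each bin, bin edges)
pdf, edges = np.histogram(x, bins=50, density=True)

# pdf is an ordinary NumPy array, so it can be indexed or looped over directly
first_value = pdf[0]
formatted = ['%.10f' % value for value in pdf]

# With density=True the values integrate to 1 over the bin widths
integral = np.sum(pdf * np.diff(edges))
print("Integration of PDF =", integral)
```

Nothing has to be opened as a file here: the values are already in memory, which is why open(pdf, 'r+') fails.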

Just to demonstrate, in case this is not what you want: I create a histogram with

bin = np.histogram(binData,range=(ymin,ymax),weights=binQ,bins=np.arange(ymin,ymax,dm0/4))

where

ind = np.argsort(my_data) # indices that sort the data from low to high

binData = my_data[ind]

binQ = weights[ind] / np.sum(weights) # normalized weight factors, in the same order as the data (for a weighted distribution; for example, if your data have uncertainties, the weights are given by the inverse uncertainties)

and ymin, ymax and dm0 are params I have specified (based on the data) to set the bin size and range of bins

The pdf in bin i is, in this case, the sum of the binQ values that fall in that bin.

np.histogram returns a tuple of (values, bin edges), so I can access the values with

bin[0][i] # this is the i'th weight (the pdf at i)

Also, the bin edges (the x values) can be accessed with

bin[1][i]

At the very least, this gives a poor man's working solution. I couldn't figure out how to get it from plt.hist.
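
A rough sketch of the weighted case described above, in current Python 3. The data, the weights, and the bin count are made-up stand-ins, and np.linspace is swapped in for np.arange so that the edges are guaranteed to span all of the data:

```python
import numpy as np

rng = np.random.default_rng(1)                  # seed only for repeatability
my_data = rng.normal(100.0, 15.0, size=200)     # stand-in data
weights = rng.uniform(0.5, 1.5, size=200)       # stand-in weights

binQ = weights / np.sum(weights)                # normalized weights, sum to 1

# Bin limits taken from the data; 25 edges give 24 bins (illustrative choice)
ymin, ymax = my_data.min(), my_data.max()
bin_edges = np.linspace(ymin, ymax, 25)

result = np.histogram(my_data, bins=bin_edges, weights=binQ)
values, edges = result                          # result is (values, edges)

i = 3
print("weight in bin", i, "=", values[i])       # same as result[0][i]
print("left edge of bin", i, "=", edges[i])     # same as result[1][i]
print("total weight =", values.sum())
```

Sorting the data first is not needed for np.histogram; the weights only have to be in the same order as the data.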

Andre


On Mar 24, 2011, at 8:47 PM, Chris Edwards wrote:

Hi,

I would like to access values in the bins of a matplotlib histogram. The following example script is an attempt to do this. Clearly pdf contains floating point numbers, but I am unable to access them.

Help with this problem would be much appreciated.

Chris
