On Sat, Oct 20, 2012 at 10:25 PM, Steven Boada <boada@…3847…> wrote:
It’d be cool if we could do something like
bins = [(0.0,0.05,0.1),(0.05,0.1,0.15)…]
Where I have specified the left edge, center and right edge of each
bin. Yeah, that’d be pretty slick.
S
On Sat Oct 20 16:21:41 2012, Steven Boada wrote:
Let’s say I generate a bunch of random numbers from 0-1. Then, I’d
like to make a histogram of it. But here’s the clincher. I’d like my
bins to overlap a bit. For example, if the first bin is from 0 - 0.1,
centered on 0.05, I’d like the next (second) bin to be centered on 0.1
and range from 0.05 - 0.15.
So basically, I want the width of each bin to be greater than the
spacing.
Is this something that could be done with the histogram function? I
did a couple of google searches and couldn’t come up with anything
meaningful. Apparently, ‘rwidth’ in the hist function just makes the
displayed bars bigger or smaller.
Any thoughts?
--
Steven Boada
Doctoral Student
Dept of Physics and Astronomy
Texas A&M University
boada@…3847…
My thoughts are that this goes against everything a histogram sets
out to do: provide a ‘discretised’ probability distribution
function from a set of discrete samples. Let’s say a sample lies in
the region where two bins overlap. How do you define which bin the
sample lies in? Both? If both, how do you define the value of the
approximated probability distribution on a bin? You could just take
the height of the bin, but some of the bin’s mass lies in each of the
neighbouring bins.
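To make the "Both?" case concrete, here is a minimal sketch in plain Python. The helper name overlapping_counts and the (left, centre, right) tuple layout are assumptions for illustration, borrowed from the bins format proposed earlier in the thread; nothing like this exists in matplotlib's hist as far as I know. It counts a sample once for every bin that contains it, so the counts add up to more than the number of samples.

```python
import random

def overlapping_counts(samples, bins):
    """Count samples per (left, centre, right) bin.

    A sample in an overlap region is counted once for EVERY bin that
    contains it, so the counts can sum to more than len(samples).
    (Hypothetical helper, not part of matplotlib.)
    """
    return [sum(1 for x in samples if left <= x < right)
            for (left, _centre, right) in bins]

random.seed(0)
samples = [random.random() for _ in range(1000)]

# Bins 0.1 wide with centres 0.05 apart: (0.0, 0.05, 0.1), (0.05, 0.1, 0.15), ...
bins = [(i / 20, (i + 1) / 20, (i + 2) / 20) for i in range(19)]

counts = overlapping_counts(samples, bins)

# Every sample outside the two end strips is counted twice, so the total
# exceeds the sample count and the heights no longer normalise naively.
print(len(samples), sum(counts))
```

Dividing each count by len(samples) would therefore give "probabilities" that sum to more than one, which is the normalisation problem described above.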
If you don’t want to apply mass to the neighbouring bins for a sample
that lies in the region where two bins overlap, you could just pick
one. You then have the problem of non-uniqueness. If you’d picked the
other bin you’d have a different probability distribution function.
This is a bad property to have.
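The non-uniqueness is easy to see with a toy example. This is only a sketch: assign_once and its prefer argument are made-up names for illustration, and the two tie-breaking rules shown are just two of many one could choose.

```python
def assign_once(samples, bins, prefer="left"):
    """Put each sample in exactly one (left, centre, right) bin.

    When a sample falls where two bins overlap, 'prefer' decides whether
    the left-most or right-most containing bin gets it.
    (Hypothetical helper, not part of matplotlib.)
    """
    counts = [0] * len(bins)
    for x in samples:
        hits = [i for i, (left, _centre, right) in enumerate(bins)
                if left <= x < right]
        if hits:
            counts[hits[0] if prefer == "left" else hits[-1]] += 1
    return counts

bins = [(0.0, 0.05, 0.1), (0.05, 0.1, 0.15)]
samples = [0.02, 0.06, 0.07, 0.09, 0.12]  # three samples sit in the overlap

print(assign_once(samples, bins, "left"))   # [4, 1]
print(assign_once(samples, bins, "right"))  # [1, 4]
```

Same data, same bins, two different histograms, depending only on an arbitrary tie-breaking rule.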
If you don’t want to pick a neighbouring bin to apply more mass, and
just increase the width of each bin’s matplotlib.patches.Patch
object, then that is more sensible. Except now you have the problem of
displaying the histogram. Which bin gets displayed over its left
neighbour? And its right neighbour?
I dread to think what this would imply if you also wanted to stack
such histograms. A potential can of worms.