That should all be in the boxplot docstring. Do you use IPython? If
not, you should.
If so, just run `plt.boxplot?` at the IPython prompt and it'll show up.
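For reference, as far as I recall the matplotlib docs give the default (non-bootstrapped) notch as median ± 1.57 × IQR / sqrt(N). Here is a rough pure-Python sketch of that formula. The names `percentile` and `notch_interval` are made up for illustration, and the quartiles assume linear interpolation. With `bootstrap=5000` the notch comes from resampling instead, so those numbers will differ from this estimate.

```python
import math


def percentile(sorted_vals, p):
    """Percentile of pre-sorted data using linear interpolation (assumed convention)."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = math.floor(k)
    hi = math.ceil(k)
    if lo == hi:
        return sorted_vals[lo]
    # Interpolate between the two neighbouring order statistics.
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def notch_interval(data):
    """Return (lower, upper) notch ends: median -/+ 1.57 * IQR / sqrt(N)."""
    vals = sorted(data)
    q1, med, q3 = (percentile(vals, p) for p in (25, 50, 75))
    half_width = 1.57 * (q3 - q1) / math.sqrt(len(vals))
    return med - half_width, med + half_width


# Hypothetical usage with a small sample.
lower, upper = notch_interval([1, 2, 3, 4, 5, 6, 7, 8, 9])
print('lower CI: %f, upper CI: %f' % (lower, upper))
```

The notch width is then simply `upper - lower`.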
-paul
On Tue, Aug 21, 2012 at 8:56 AM, Virgil Stokes <vs@...2650...> wrote:
On 21-Aug-2012 17:50, Paul Hobson wrote:
On Tue, Aug 21, 2012 at 7:58 AM, Virgil Stokes <vs@...2650...> wrote:
In reference to my previous email.
How can I find the outliers (sample points beyond the whiskers) in the data
used for the boxplot?
Here is a code snippet that shows how it was used for the timings data (a list
of 4 sublists (y1, y2, y3, y4), each containing 400,000 real data values):
...
...
...
# Box Plots
plt.subplot(2,1,2)
timings = [y1,y2,y3,y4]
pos = np.array(range(len(timings)))+1
bp = plt.boxplot( timings, sym='k+', patch_artist=True,
                  positions=pos, notch=1, bootstrap=5000 )
plt.xlabel('Algorithm')
plt.ylabel('Execution time (sec)')
plt.ylim(0.9*ymin,1.1*ymax)
plt.setp(bp['whiskers'], color='k', linestyle='-' )
plt.setp(bp['fliers'], markersize=3.0)
plt.title('Box plots (%4d trials)' %(n))
plt.show()
...
...
...
Again my questions:
1) How to get the value of the median?
2) How to find the outliers (outside the whiskers)?
3) How to find the width of the notch?
Virgil, the objects stuffed inside the `bp` dictionary should have
methods to retrieve their values. Let's see:
In [35]: x = np.random.lognormal(mean=1.25, sigma=1.35, size=(37,3))
In [36]: bp = plt.boxplot(x, bootstrap=5000, notch=True)
In [37]: # Question 1
    ...: print('medians')
    ...: for n, median in enumerate(bp['medians']):
    ...:     print('%d: %f' % (n, median.get_ydata()[0]))
    ...:
medians
0: 6.339692
1: 3.449320
2: 4.503706
In [38]: # Question 2
    ...: print('fliers')
    ...: for n in range(0, len(bp['fliers']), 2):
    ...:     print('%d: upper outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n].get_ydata())
    ...:     print('\n%d: lower outliers = \t' % (n/2,))
    ...:     print(bp['fliers'][n+1].get_ydata())
    ...:     print('\n')
    ...:
You had no outliers!
In [39]: # Question 3
    ...: print('Confidence Intervals')
    ...: for n, box in enumerate(bp['boxes']):
    ...:     print('%d: lower CI: %f' % (n, box.get_ydata()[2]))
    ...:     print('%d: upper CI: %f' % (n, box.get_ydata()[4]))
    ...:
Confidence Intervals
0: lower CI: 1.760701
0: upper CI: 10.102221
1: lower CI: 1.626386
1: upper CI: 5.601927
2: lower CI: 2.173173
Hope that helps,
-paul
Just what I was looking for, Paul! Thanks very much.
One final question: where can I find the documentation that answers my
questions and gives more details about the equations used for the width of
the notch, etc.?
Thanks again
I still have a problem...
Let me show the updated code snippet again
...
# Box Plots
iplt += 1
plt.figure(iplt)
timings = [ya[0],ya[1],ya[2],ya[3]]
pos = np.array(range(len(timings)))+1
bp = plt.boxplot( timings, sym='k+', patch_artist=True,
                  positions=pos, notch=1, bootstrap=5000 )
print('medians')
for nn, median in enumerate(bp['medians']):
    print('%d: %f' % (nn, median.get_ydata()[0]))
print('fliers')
for nn in range(0, len(bp['fliers']), 2):
    print('%d: upper outliers = \t' % (nn/2,))
    print(bp['fliers'][nn].get_ydata())
    print('\n%d: lower outliers = \t' % (nn/2,))
    print(bp['fliers'][nn+1].get_ydata())
    print('\n')
print('Confidence Intervals')
for nn, box in enumerate(bp['boxes']):
    print('%d: lower CI: %f' % (nn, box.get_ydata()[2]))  # <--- FAILS!
    print('%d: upper CI: %f' % (nn, box.get_ydata()[4]))
...
Medians and fliers work perfectly, but I get the following error message when trying to access the confidence intervals:
AttributeError: 'PathPatch' object has no attribute 'get_ydata'
Note that I am using boxplot with 4 sets of data and matplotlib version 1.1.0.
Any suggestions on how to fix this problem?
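One possible workaround, offered as a sketch rather than a confirmed fix: with `patch_artist=True` the boxes are `PathPatch` objects rather than `Line2D` objects, so the y-values have to be read from the patch's path instead of `get_ydata()`. The code below assumes the path vertices come in the same order as the `Line2D` data, so indices 2 and 4 would still be the notch ends. It uses made-up lognormal data in place of the real timings and leaves out `bootstrap` to keep the run short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (hypothetical): 4 samples in place of the real timings.
rng = np.random.RandomState(0)
timings = [rng.lognormal(mean=1.25, sigma=1.35, size=200) for _ in range(4)]

bp = plt.boxplot(timings, patch_artist=True, notch=True)

notches = []
print('Confidence Intervals')
for nn, box in enumerate(bp['boxes']):
    if hasattr(box, 'get_ydata'):
        # Line2D box (patch_artist=False)
        y = box.get_ydata()
    else:
        # PathPatch box (patch_artist=True): take the y column of the path vertices
        y = box.get_path().vertices[:, 1]
    notches.append((y[2], y[4]))
    print('%d: lower CI: %f' % (nn, y[2]))
    print('%d: upper CI: %f' % (nn, y[4]))
```

The `hasattr` check is only there so the same loop works whether or not `patch_artist` is set.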
On 21-Aug-2012 17:59, Paul Hobson wrote:
That should all be in the boxplot docstring. Do you use IPython? If
not, you should.
If so, just run `plt.boxplot?` at the IPython prompt and it'll show up.
-paul