Those figures look great. Seaborn has some similar functionality (scroll
down a bit):
right -- seaborn looks really nice and I have yet to take advantage of it.
BUT that is why we are talking here, on the matplotlib list: seaborn (and
a few others), while aiming to provide high-level convenience specific to
e.g. using pandas as the core data structure, add improvements which
could easily go into stock matplotlib and thus benefit all of its users.
That is why I thought that improving boxplot itself could be of
more generic benefit, while allowing all the dependent projects to take
advantage of it without unnecessary fragmentation (e.g. "use
seaborn for paired plots", which could easily go straight into stock
boxplot operating on arrays).
Even violin plots could probably be done in matplotlib with
some basic density estimator (with parameter for a custom one) as an
option within boxplot function itself.
The main point of the most recent overhaul of boxplots was to allow users
to do just what you describe. The methods plt.boxplot and ax.boxplot now do
very little on their own. Input data are passed to
matplotlib.cbook.boxplot_stats, which returns a list of dictionaries of
statistics (one per box), and then ax.bxp actually does the drawing. All
of this is to say that you can write your own function to modify
boxplot_stats' output, or independently generate the list of dictionaries
expected by ax.bxp.
The keys of those dictionaries can include:
- label -> tick label for the boxplot
- mean -> mean value (can plot as a line or point)
- median -> 50th percentile
- q1 -> first quartile (25th pctl)
- q3 -> third quartile (75th pctl)
- cilo -> lower notch around the median
- ciho -> upper notch around the median
- whislo -> end of the lower whisker
- whishi -> end of the upper whisker
- fliers -> outliers
Basically, you can set the appropriate values to whatever you want to draw
boxplots however you wish (like open/close diagrams for pandas).
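
A minimal sketch of that workflow, in case it helps. The data, the labels,
and the choice to stretch the whiskers to the min/max are made up purely
for illustration; only cbook.boxplot_stats and ax.bxp come from the
description above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cbook

# Made-up sample data: three groups of 50 points each.
rng = np.random.RandomState(0)
data = [rng.normal(loc=m, scale=1.0, size=50) for m in (0.0, 1.0, 2.0)]

# Step 1: compute the list of per-box statistics dictionaries.
stats = cbook.boxplot_stats(data, labels=["a", "b", "c"])

# Step 2: modify the statistics before drawing. As an arbitrary example,
# stretch the whiskers to the data min/max and drop the fliers.
for s, d in zip(stats, data):
    s["whislo"] = d.min()
    s["whishi"] = d.max()
    s["fliers"] = np.array([])

# Step 3: hand the dictionaries to ax.bxp, which does the actual drawing.
fig, ax = plt.subplots()
artists = ax.bxp(stats, showmeans=True)
fig.canvas.draw()  # render; use plt.show() or fig.savefig(...) interactively
```

The same pattern should work with dictionaries built entirely by hand, as
long as they carry the keys listed above.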
Also, the `whis` kwarg accepted by boxplot and cbook.boxplot_stats can
be either a float (1.5 by default), a list of integer percentiles (like
[5, 95]), or one of the strings 'range', 'limits', or 'min/max', any of
which will extend the whiskers to cover all of the data.
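
To make the difference concrete, here is a small sketch comparing the
default float form with a percentile pair. The lognormal sample is an
arbitrary choice, and the string forms are left out of the sketch.

```python
import numpy as np
from matplotlib import cbook

# Made-up skewed sample so that the whisker choice visibly matters.
rng = np.random.RandomState(1)
x = rng.lognormal(size=200)

# Default (whis=1.5): whiskers reach the most extreme data points that lie
# within 1.5 * IQR of the box edges.
default_stats = cbook.boxplot_stats(x)[0]

# Percentile pair: whiskers reach the most extreme data points that lie
# between the 5th and 95th percentiles; everything outside becomes a flier.
pct_stats = cbook.boxplot_stats(x, whis=[5, 95])[0]

print("default:", default_stats["whislo"], default_stats["whishi"])
print("5/95:   ", pct_stats["whislo"], pct_stats["whishi"],
      "fliers:", len(pct_stats["fliers"]))
```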
Since you're running off of master, you should have access to this new functionality.
usually I run off the releases and even more often from releases in
Debian stable. But yes -- I have the master and this new functionality
looks neat -- thanks again. But those few enhancements, such as
- plot actual datapoints with the jitter
- plot pairing lines across boxplots
seem not to be there, and I would consider them worthwhile enhancements.
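
For what it's worth, both can be approximated today on top of a plain
ax.boxplot call. A rough sketch, assuming two paired samples of equal
length; the data, the jitter width, and the styling are all made up, and
nothing here is an existing boxplot option.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up paired measurements: after[i] belongs to the same subject as before[i].
rng = np.random.RandomState(2)
n = 20
before = rng.normal(0.0, 1.0, size=n)
after = before + rng.normal(0.5, 0.3, size=n)
data = [before, after]
positions = [1, 2]

fig, ax = plt.subplots()
# Hide the fliers, since every datapoint is drawn explicitly below.
ax.boxplot(data, positions=positions, widths=0.5, showfliers=False)

# Jitter: a small horizontal offset per subject, reused for both boxes.
jitter = rng.uniform(-0.1, 0.1, size=n)
xs = [pos + jitter for pos in positions]
for x, y in zip(xs, data):
    ax.plot(x, y, "o", color="k", alpha=0.5, zorder=3)

# Pairing lines: with 2-D inputs, ax.plot draws one line per column,
# i.e. one line per subject connecting its two measurements.
lines = ax.plot(np.vstack(xs), np.vstack(data), "-",
                color="gray", alpha=0.4, zorder=2)
fig.canvas.draw()  # render; use plt.show() or fig.savefig(...) interactively
```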
Feel free to hit me up with any other questions!
sorry that I hit you with what is not really a question above
On Sat, 15 Feb 2014, Paul Hobson wrote:
Yaroslav O. Halchenko, Ph.D.