dataLim and getting the data from `scatter` (bonus: an attempt at a Frame class)

Hi. A couple of questions about scatter:

frame.py (11.7 KB)

Q1

====

The bounding box axes.dataLim increases in size when calling scatter(x, y) compared to plot(x, y) (for the same x, y data, of course). I think this change is due to the size of the markers being added to the data limits (not sure if this is true). Is there an easy way to get the same data limits I would have for a call to plot?

Q2

Is there a way to get the data from the axes of a scatter plot?

Initially I thought I could get it with:

for collection in ax.collections:
    for path in collection._paths:
        print(path.vertices)

But this seems to be the path that draws the scatter markers. Any ideas?

Frame Class

==========

Finally, if anyone is interested, I’m playing around with a Frame class for axes.frame. This class adds customizable axes frames similar to the topic of this thread:

http://sourceforge.net/mailarchive/message.php?msg_id=87d57vjgye.fsf%40peds-pc311.bsd.uchicago.edu

In this older thread, the SAGE axes frames were criticized for not being flexible enough. I’ve tried to make this class as general as possible (within my ability :). As an example of the flexibility of this Frame class, I’ve added some Tufte-style frames similar to:

http://hupp.org/adam/weblog/2007/09/03/etframes-applying-the-ideas-of-edward-tufte-to-matplotlib/

To the developers on this thread: If there’s anything I could do to make the attached Frame class more flexible (and more suitable for possible inclusion into MPL), I’d be happy to get some feedback.

Current Limitations:

================

  • the frame can only be placed on the borders of the axes (mainly because I don’t know how to move the tickers anywhere else).

  • RangeFrame only works with linear data (I’m not sure how to use the axes.transScale to properly transform the data)

  • RangeFrame and DashDotFrame don’t work properly with scatter (because of the questions in this post).

The frame class itself isn’t too long, but I got a little carried away adding in extra crap.

Sorry for the long, rambling email. ;)

-Tony

Hi. A couple of questions about `scatter`:
Q1

The bounding box `axes.dataLim` increases in size when calling scatter(x, y)
compared to plot(x, y) (for the same x, y data, of course). I think this
change is due to the size of the markers being added to the data limits (not
sure if this is true). Is there an easy way to get the same data limits I
would have for a call to `plot`?

I think your explanation is correct: the size of the markers is
included for scatter (and patches in general) but not for plot (and
lines in general). There is no *easy* way, but you could maintain
your own bbox instance and update it in the same way the
Axes._update_datalim* methods do. (See Axes._update_patch_limits and
Axes._update_line_limits for example.)
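
A minimal sketch of the "maintain your own bbox" idea, assuming you still have the raw x, y arrays that were passed to scatter. The helper name `data_bbox` and the sample data are hypothetical, not part of matplotlib; only `Bbox.from_extents` and `Bbox.union` are library calls. It tracks limits from the data alone, so marker size never enters.

```python
import numpy as np
from matplotlib.transforms import Bbox


def data_bbox(x, y):
    """Hypothetical helper: tight bounding box of the raw data points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return Bbox.from_extents(x.min(), y.min(), x.max(), y.max())


# Made-up data standing in for two separate scatter() calls.
first = data_bbox([1.0, 2.0, 3.0], [4.0, 0.5, 2.0])
second = data_bbox([-1.0, 0.0], [1.0, 6.0])

# Merge the per-call boxes into one running data limit.
limits = Bbox.union([first, second])
print(limits.extents)  # x0, y0, x1, y1
```

The resulting box could then be used in place of `axes.dataLim` wherever the frame code needs the extent of the data itself.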

Q2

Is there a way to get the data from the axes of a scatter plot?
Initially I thought I could get it with:

for collection in ax.collections:
    for path in collection._paths:
        print(path.vertices)

But this seems to be the path that draws the scatter markers. Any ideas?

The "offsets" of the collection are the x and y locs, and the vertices
are the marker path, as you observed. We do not have a "publicly
accessible" api for accessing the offsets (they are stored as
col._offsets where col is the collection instance), but you could add
an api (get_offsets, set_offsets, etc...).
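
For readers on a later matplotlib release: collections there do expose the offsets through a public `get_offsets()` method, so the private attribute is no longer needed. A short sketch, assuming such a release and using made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

fig, ax = plt.subplots()
ax.scatter(x, y)

for collection in ax.collections:
    # One (x, y) row per marker, in data coordinates.
    offsets = np.asarray(collection.get_offsets())
    print(offsets)
```

The marker outline itself is still what comes back from the collection's paths, as observed in the question.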

Frame Class

Finally, if anyone is interested, I'm playing around with a Frame class for
`axes.frame`. This class adds customizable axes frames similar to the topic
of this thread:
http://sourceforge.net/mailarchive/message.php?msg_id=87d57vjgye.fsf%40peds-pc311.bsd.uchicago.edu
In this older thread, the SAGE axes frames were criticized for not being
flexible enough. I've tried to make this class as general as possible
(within my ability :). As an example of the flexibility of this Frame class,
I've added some Tufte-style frames similar to:
http://hupp.org/adam/weblog/2007/09/03/etframes-applying-the-ideas-of-edward-tufte-to-matplotlib/
To the developers on this thread: If there's anything I could do to make the
attached Frame class more flexible (and more suitable for possible inclusion
into MPL), I'd be happy to get some feedback.

We are *definitely* interested in this. It has some features I am
less interested in, and is missing some of the features I am very
interested in. In the former camp, some of the Tufte ideas have a
place somewhere, and the frame class (or extensions thereof) is
probably the right place for them, but I don't consider these to be
the core limitations of the current frame handling. Don't get me
wrong, I am a big Tufte fan, have all of his books, have attended his
lecture, am into his ideas, etc. I just don't consider the lack of
such features to be the major problem with the current impl. The two
major problems as I see them are: 1) the ability to just draw part of
the frame, which you've partly addressed, and 2) the ability to put
the frame (and associated ticks and labels) where you want, which is
still lacking. I'd like the user to be able to say: draw the y axis
at such-and-such a loc in such-and-such coords (axes or data) and have
the ticks and labels come along for the ride. As you've noticed, this
takes a bit more work. Ideally, one could have one or more of these
(the current top and bottom for the x axes would then just be two
incarnations) with (possibly) separate tickers and formatters.
Because this goes to the core plotting functionality, it is not easy
either from a design or implementation standpoint, particularly within
the constraints of backward compatibility, but it is a problem well
worth solving. It's been on the wish list a long time.

I'd love for you to take the lead on this. Given my (and other
developers') constraints on time, we'll have only limited time to help,
but hopefully we can give you some pointers when you get stuck.

Current Limitations:

* the frame can only be placed on the borders of the axes (mainly because I
don't know how to move the tickers anywhere else).

Look at how the transforms are set in the axis.Axis class for the
tickers - the ticks and labels have "blended transforms" which blend
data coords and axes coords. Eg, the x axis has the x location in
data coords and the y location in axes coords (0 for the bottom and 1
for the top). For example, for default rectilinear coords, this
transformation is built by the Axes instance:

        self._xaxis_transform = mtransforms.blended_transform_factory(
                self.axes.transData, self.axes.transAxes)

and accessed via

    def get_xaxis_transform(self):
        """
        Get the transformation used for drawing x-axis labels, ticks
        and gridlines. The x-direction is in data coordinates and the
        y-direction is in axis coordinates.

        .. note::
            This transformation is primarily used by the
            :class:`~matplotlib.axis.Axis` class, and is meant to be
            overridden by new kinds of projections that may need to
            place axis elements in different locations.
        """
        return self._xaxis_transform
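
As a rough illustration of what such a blended transform does at the user level, the sketch below builds one with the same factory call and uses it to draw a tick-like mark. The plotted data, the mark position (x = 4) and its length are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.transforms as mtransforms

fig, ax = plt.subplots()
ax.plot([0, 10], [0, 5])

# x is read in data coordinates, y in axes coordinates (0 = bottom, 1 = top).
trans = mtransforms.blended_transform_factory(ax.transData, ax.transAxes)

# A short vertical mark at data x = 4 covering the lowest 5% of the axes.
mark, = ax.plot([4, 4], [0.0, 0.05], transform=trans, color="k", clip_on=False)

# Display-space position of the point (data x = 4, axes y = 0).
display_xy = trans.transform((4, 0.0))
print(display_xy)
```

Swapping the axes-coordinate y value (0 here) for something else is the kind of change that would move a frame line away from the border, though the ticks and labels would still need the same treatment.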

* RangeFrame only works with linear data (I'm not sure how to use the
`axes.transScale` to properly transform the data)
* RangeFrame and DashDotFrame don't work properly with `scatter` (because of
the questions in this post).
The frame class itself isn't too long, but I got a little carried away
adding in extra crap.
Sorry for the long, rambling email. ;)

No problem -- this is a complicated part of the code that is not
particularly flexible or extensible. It would be extremely useful if
you could develop an extensible API for axis handling (so one could
incorporate some of these Tufte extensions and related ideas at the
*user* level even if they are not built in, though we'd probably
supply some by default or by example) that also solves the core
problem of letting the user draw the axis (one or more) where they
want. SAGE, for example, had to hack an entirely separate axis
implementation (with their own tick lines and labels) just to draw
them in the center, Mathematica-style.

Keep us posted!
JDH


On Sun, Jun 29, 2008 at 6:10 PM, Tony S Yu <tonyyu@...1166...> wrote:

I’d love for you to take the lead on this. Given my (and other
developers’) constraints on time, we’ll have only limited time to help,
but hopefully we can give you some pointers when you get stuck.

I don’t know if I’m the best person to be taking this on, but if no one else is interested, then I’d be happy to take a shot at it.

Current Limitations:

  • the frame can only be placed on the borders of the axes (mainly because I
    don’t know how to move the tickers anywhere else).

Look at how the transforms are set in the axis.Axis class for the
tickers - the ticks and labels have “blended transforms” which blend
data coords and axes coords.

That reminds me: does it make more sense to have the frame as an attribute/child of each axis (just as ticks are attributes of each axis)? It seemed more appropriate to me, but I just used the frame in axes because it was already defined.

It would be extremely useful if
you could develop an extensible API for axis handling (so one could
incorporate some of these Tufte extensions and related ideas at the
user level even if they are not built in, though we’d probably
supply some by default or by example)

I didn’t really expect the Tufte-style frames to be incorporated into the core. Incidentally, my initial goal was just to play around with these Tufte-style frames, not to write a Frame class.

Keep us posted!

Will do. Thanks for your comments.

-Tony


On Jun 30, 2008, at 10:10 PM, John Hunter wrote: