fixing overlapping annotations

I am trying to plot a large number of locations that need to be labeled. The locations are often tightly clustered, and the resulting text is unreadable. I have looked through the API and the examples on the matplotlib web page, and I don’t see a straightforward way to plot text labels while preventing them from overlapping. I suppose there is no easy answer, since placing each label close to its point without overlapping any other label is a kind of optimization problem.

Using annotate(), the location and alignment of the text can be fixed, but you don’t know the size of the resulting box until after draw() is called. Once draw() has been called, you can query the bounding box of each label and check whether it overlaps other labels, but this is an iterative process, and draw() can be quite slow to call repeatedly.

Unless you use a fixed-width font (possible, but not optimal), you just don’t know how big the labels will be, where they will extend to, or how to keep them apart. Avoiding overlaps would mean building some sort of accounting system for the location and size of each text box, outside of the matplotlib API, which seems sub-optimal.

Has anybody dealt with this problem and come up with an elegant or efficient solution?

Kris

Kris Kuhlman, on 2011-02-01 18:03, wrote:

I am trying to plot a large number of locations that need to be labeled.
Often the locations are quite clustered and the resulting text is
unreadable. I have been looking through the API and examples on the
matplotlib web page, and I don't see a straightforward way to plot text
labels, preventing them from overlapping. There is no easy answer to the
problem, since locating the labels so they are close to the point you want
to label, and not overlapping is a sort of optimization problem, I guess.

Using annotate(), the location and alignment of the text can be fixed, but
you don't know the size of the resulting box until after draw() is called.
Once draw is called, you can inquire what the bounding box for a label is,
and then check to see if it overlaps with other labels, but this is an
iterative process, and draw() can be quite slow to call repeatedly.

I guess unless you use a fixed-width font (possible, but not optimal), you
just don't know how big the labels will be, and therefore where they will
extend to, and then how they should be avoided. This involves coming up
with some sort of accounting system for the location and size of each text
box, outside of the matplotlib API, and seems sub-optimal.

Has anybody dealt with this problem and come up with an elegant or efficient
solution?

Hi Kris,

Unfortunately, there isn't a turn-key solution implemented for
this at the moment. It would be very useful, and it is something
I've been wanting to see in matplotlib, but I never had a strong
enough need to implement it myself.

Take a look here for the type of machinery that could be used to
implement such functionality:

http://matplotlib.sourceforge.net/faq/howto_faq.html#automatically-make-room-for-tick-labels

best,

--
Paul Ivanov
314 address only used for lists, off-list direct email at:
http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7

Paul Ivanov, on 2011-02-01 17:14, wrote:

I should add that the overlaps and count_overlaps methods of
mpl.transforms.Bbox could be used for some sort of iterative
solution, as you can get the bounding box using

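  import matplotlib.pyplot as plt
  a = plt.annotate("Foo", (1, 2))
  plt.gcf().canvas.draw()  # text extents are only known after a draw
  bbox = a.get_window_extent()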
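To make that a bit more concrete, here is a minimal sketch of the overlap check. The point coordinates and the "site N" label text are made up for illustration, and the Agg backend is chosen only so that it runs without a display. It draws once, collects each label's bounding box in display (pixel) coordinates, and counts how many of the other labels each one overlaps.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, just for this sketch
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Made-up data: the first two points are nearly coincident.
points = [(1.0, 2.0), (1.01, 2.0), (3.0, 4.0)]
ax.plot(*zip(*points), "o")
labels = [ax.annotate(f"site {i}", xy) for i, xy in enumerate(points)]

# A single draw so that the text extents are known.
fig.canvas.draw()

# Bounding box of each label in display (pixel) coordinates.
bboxes = [lab.get_window_extent() for lab in labels]

# For each label, count how many of the *other* labels it overlaps.
n_overlaps = [
    bb.count_overlaps(bboxes[:i] + bboxes[i + 1:])
    for i, bb in enumerate(bboxes)
]
print(n_overlaps)
```

Any label with a non-zero count is a candidate for being moved. Note that the boxes are in pixels, so they are only valid for the figure size and dpi in effect when they were measured.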

Also, depending on the number of labels and the need for reproducibility
of plots, you can just make the labels draggable and move them
around using the mouse:

  a.draggable()
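If dragging by hand is not practical (many labels, or plots that need to be reproducible), one possible non-interactive approach is a simple greedy pass over the label boxes after they have been measured once. The sketch below is plain Python and is not part of matplotlib. The names stack_labels, overlaps and pad are hypothetical, and the boxes are assumed to be (x0, y0, x1, y1) tuples in pixels. It only ever pushes labels upward, so it is a crude heuristic rather than a real optimizer.

```python
def overlaps(a, b):
    """True if rectangles a and b, given as (x0, y0, x1, y1), intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def stack_labels(boxes, pad=2.0):
    """Return one vertical shift per box (same units as the boxes).

    Boxes are handled in order; each one is pushed upward until it is
    clear of every box that was placed before it.
    """
    placed, shifts = [], []
    for x0, y0, x1, y1 in boxes:
        dy = 0.0
        moved = True
        while moved:
            moved = False
            for other in placed:
                if overlaps((x0, y0 + dy, x1, y1 + dy), other):
                    # Jump just above the box we hit, then re-check all boxes.
                    dy = other[3] + pad - y0
                    moved = True
        placed.append((x0, y0 + dy, x1, y1 + dy))
        shifts.append(dy)
    return shifts


# Made-up boxes: the first three collide, the last one is isolated.
boxes = [(0, 0, 40, 10), (2, 0, 42, 10), (1, 0, 41, 10), (100, 0, 140, 10)]
shifts = stack_labels(boxes)
print(shifts)
```

With matplotlib, the input boxes could come from get_window_extent().extents after a single draw, and the shifts could then be applied as text offsets, so that draw() is not needed inside the loop.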

best,

--
Paul Ivanov