Eric Firing wrote:
Mike, John,
Because path simplification does not work with anything but a continuous line, it is turned off if there are any nans in the path. The result is that if one does this:
import numpy as np
from matplotlib.pyplot import plot, show

xx = np.arange(200000)
yy = np.random.rand(200000)
# plot(xx, yy)    # without the nan, this renders fine
yy[1000] = np.nan
plot(xx, yy)      # with the nan, rendering fails as described below
show()
the plot fails with an incomplete rendering and general unresponsiveness; apparently some mysterious Agg limit is quietly exceeded.
The limit in question is "cell_block_limit" in agg_rasterizer_cells_aa.h. I suspect the relationship between the number of vertices and the number of rasterization cells depends on the nature of the values.
However, if we want to increase the limit: each "cell_block" is 4096 cells of 16 bytes each, and the allocation currently maxes out at 1024 cell blocks, for a total of 67,108,864 bytes. So the question is how much memory should be devoted to rasterization when the data set is this large. I think we could safely quadruple that limit on a lot of modern machines, and raising the maximum won't affect people plotting smaller data sets, since the memory is allocated dynamically anyway. It works for me, but I have 4 GB of RAM here at work.
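For reference, the arithmetic (constant names follow agg_rasterizer_cells_aa.h; the 16-byte cell size is approximate):

# Back-of-envelope check of the rasterizer memory cap.
cells_per_block = 4096     # cells in one "cell_block"
bytes_per_cell = 16        # rough size of one rasterization cell
max_blocks = 1024          # current cell_block_limit

cap = cells_per_block * bytes_per_cell * max_blocks
print(cap)                 # 67108864 bytes == 64 MB
print(cap * 4)             # 268435456 bytes == 256 MB if quadrupled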
With or without the nan, this test case also shows the bizarre slowness of add_line that I asked about in a message yesterday and that has me completely baffled.
lsprofcalltree is my friend!
Both of these are major problems for real-world use.
Do you have any thoughts on timing and strategy for solving this problem? A few weeks ago, when the problem with nans and path simplification turned up, I tried to figure out what was going on and how to fix it, but I did not get very far. I could try again, but as you know I don't get along well with C++.
That simplification code is pretty hairy, particularly because it tries to avoid a copy by doing everything in an iterator/generator way. I think even just supporting MOVETOs there would be tricky, but it is probably the easiest first step.
I am also wondering whether more than straightforward path simplification with nan/moveto might be needed. Suppose there is a nightmarish time series with every third point being bad, so it is essentially a sequence of 2-point line segments. The simplest form of path simplification fix might be to reset the calculation whenever a moveto is encountered, but this would yield no simplification in this case. I assume Agg would still choke. Is there a need for some sort of automatic chunking of the rendering operation in addition to path simplification?
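To make the "reset at each moveto" idea concrete, here is a toy sketch in plain Python (a simple colinearity test, not the actual iterator-based code in the Agg backend), where a nan row plays the role of a moveto:

import numpy as np

def simplify_reset_at_moveto(xy, tol=1e-9):
    # Drop a vertex when it is (nearly) colinear with its neighbors.
    # A nan row acts as a moveto and flushes the current run, so no
    # simplification ever spans a gap.  With every third point bad,
    # runs never reach three points and nothing is ever dropped --
    # exactly the nightmarish worst case.
    out, run = [], []
    for p in xy:
        if np.isnan(p).any():      # moveto: reset the calculation
            out.extend(run)
            out.append(p)
            run = []
            continue
        run.append(p)
        if len(run) >= 3:
            a, b, c = run[-3], run[-2], run[-1]
            cross = ((b[0] - a[0]) * (c[1] - a[1])
                     - (b[1] - a[1]) * (c[0] - a[0]))
            if abs(cross) < tol:   # a, b, c colinear: b is redundant
                del run[-2]
    out.extend(run)
    return np.array(out)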
Chunking is probably something worth looking into (for lines, at least), as it might also reduce memory usage vs. the "increase the cell_block_limit" scenario.
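The chunking would presumably live in the renderer, but the idea can be sketched at the Python level by breaking one long line into several shorter, overlapping ones (the chunk size here is arbitrary):

import numpy as np
import matplotlib.pyplot as plt

def plot_chunked(x, y, chunk=10000, **kwargs):
    # Render one long line as several shorter ones so that no single
    # path blows the rasterizer's cell budget.  The one-point overlap
    # keeps consecutive pieces visually connected.
    ax = plt.gca()
    for i in range(0, len(x) - 1, chunk):
        ax.plot(x[i:i + chunk + 1], y[i:i + chunk + 1], **kwargs)

xx = np.arange(200000)
yy = np.random.rand(200000)
yy[1000] = np.nan
plot_chunked(xx, yy, color='b')   # fix the color so the chunks match
plt.show()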
I also think that for the special case of high-resolution time series data, where x is uniform, there is an opportunity to do something completely different that should be far faster. Audio editors (such as Audacity) draw each column of pixels based on the min/max and/or mean and/or RMS of the values within that column. This makes the rendering extremely fast and simple. See:
http://audacity.sourceforge.net/about/images/audacity-macosx.png
Of course, that would mean writing a bunch of new code, but it shouldn't be incredibly tricky new code. It could convert the time series data to an image and plot that, or to a filled polygon whose vertices are downsampled from the original data. The latter may be nicer for PS/PDF output.
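A rough illustration of the min/max-per-column idea (the 800-pixel width and the fill_between-based drawing are just one possible realization):

import numpy as np
import matplotlib.pyplot as plt

def minmax_envelope(y, width):
    # Keep only the per-pixel-column min and max of a uniformly sampled
    # series; `width` is the plot width in pixels and should be much
    # smaller than len(y).  (Use nanmin/nanmax if gaps are present.)
    n = (len(y) // width) * width        # trim to whole columns
    cols = y[:n].reshape(width, -1)      # one row of samples per column
    return cols.min(axis=1), cols.max(axis=1)

yy = np.random.randn(200000).cumsum()    # stand-in high-resolution series
lo, hi = minmax_envelope(yy, 800)
plt.fill_between(np.arange(800), lo, hi) # the downsampled filled polygon
plt.show()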
Cheers,
Mike
--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA