requesting permission to remove traits and configobj

_Darren_Dale2 · December 11, 2008, 3:20am

There has been a report at the bugtracker complaining that matplotlib is overwriting an existing installation of configobj. I had a look at the code and thought the bug report must be a mistake or windows specific, but I just saw similar behavior on my linux system.

I would like to simply remove configobj and traits from our repository. They are only used by the long-neglected experimental traited config package, which is only of interest to developers who can easily install them as external dependencies. Is it ok to remove them? If so, should it be done on all the branches?

How long are we going to continue to maintain the different branches? It was so much easier back when we only had to worry about the trunk…

Thanks,
Darren

_John_Hunter · December 11, 2008, 4:10am

There has been a report at the bugtracker complaining that matplotlib is overwriting an existing installation of configobj. I had a look at the code and thought the bug report must be a mistake or windows specific, but I just saw similar behavior on my linux system.

Ignoring for a minute the question of whether we can/should flush configobj/traits, it sounds like the real problem is that setup.cfg is not working like we expect it to. And that is something that should be fixed if it is broken. If mpl is installing configobj or traits even if we are telling it not to via setup.cfg, then we have a problem. This is worth knowing, since the last mpl release was broken vis-a-vis the default backend on win32, which could be explained by a broken setup.cfg.

I would like to simply remove configobj and traits from our repository. They are only used by the long-neglected experimental traited config package, which is only of interest to developers who can easily install them as external dependencies. Is it ok to remove them? If so, should it be done on all the branches?

How long are we going to continue to maintain the different branches? It was so much easier back when we only had to worry about the trunk…

You can remove them from the trunk. They should remain on the 0.91 branch as is (with any known bugs fixed and merged) since that is the point of the branch (stability for those who cannot upgrade – in principle someone might be depending on the traited config, in practice unlikely). As for the 0.98 branch, it is slated for destruction so no worries. I share your visceral reaction against branches, but my head is starting to override this bodily reaction, as I see the need for them in practice. If we carefully document the best practices and motivations in the developer’s guide, we can use them advantageously.

We have a lot of people contributing to mpl, and approaching or just after release time we need some mechanism for stabilizing the tested feature set of the release candidate while allowing other development to proceed, and branches are the natural mechanism for that. That we are starting to use them is a reflection of the fact that we have many more active developers than we ever had before (12 developers contributed between 98.3 and 98.4; it used to be just 3 or 4 at a time). I wouldn’t be advocating branches otherwise, because I am an advocate of doing things as simply as possible: “Make everything as simple as possible, but not simpler.”

In general, I am in favor of the wildest, wooliest development process we can afford. I would like to have everyone on the trunk, making releases as often as possible (nightly if we can), with an attitude of “if you break it, just fix it and rerelease it”. This model worked fine for us for years, and I think it would continue to work if we have a hyper-active developer or an automated build bot. In the old days, I would release any time I added a new feature, and if I broke something I would have a new release out in hours. I no longer have the time for that, and we are lucky to have Charlie building OS X and win32 binaries and eggs for multiple python versions. When we release broken code, Charlie has to go through the entire test/upload/release cycle again, building for multiple OSs and python versions while taking care of his wife and two babies, so we want to minimize that. At the same time, we have lots of developers pushing code into the mainline. We need some mechanism of balancing the desire of developers to get new code in and the need for the packagers and release managers to get stable code out.

I think the right balance for mpl before a release is to test the HEAD, sign off on it, branch it, let development proceed on the HEAD, and put any release critical bugs and fixes into the branch. When the next release comes up, delete the old branch, and wash-rinse-repeat. We will have in perpetuity one active release branch at a time, which gets important bug fixes and nothing else, and in addition (for a while) a legacy branch (0.91) which is updated once a month or so. I am happy to get input on this.

JDH

···

On Wed, Dec 10, 2008 at 9:20 PM, Darren Dale <dsdale24@…55…149…> wrote:

_Darren_Dale2 · December 11, 2008, 1:50pm

There has been a report at the bugtracker complaining that matplotlib is overwriting an existing installation of configobj. I had a look at the code and thought the bug report must be a mistake or windows specific, but I just saw similar behavior on my linux system.

Ignoring for a minute the question of whether we can/should flush configobj/traits, it sounds like the real problem is that setup.cfg is not working like we expect it to. And that is something that should be fixed if it is broken. If mpl is installing configobj or traits even if we are telling it not to via setup.cfg, then we have a problem. This is worth knowing, since the last mpl release was broken vis-a-vis the default backend on win32, which could be explained by a broken setup.cfg.

I think I figured this out in my sleep last night. I don’t think setup.cfg or setupext.py is broken. Here is what happened:

I have a new system and I want to build mpl. I run setup.py build, with no setup.cfg, and setupext.py tells me that I don’t have configobj and mpl is going to provide it. That’s not what I want; I would rather install it with kubuntu’s package manager, so I do. Now I run setup.py build again and mpl tells me that it found the configobj I installed with apt-get. Great, so I run setup.py install and, wait for it, mpl installs its own configobj, overwriting the one I just installed, because I forgot to delete my build/ directory, which contained a configobj from the first time I ran setup.py build.

This can probably bite us when building the Windows installers too; hopefully Charlie deletes his build directory as part of the standard procedure.
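The cleanup step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stale copy is assumed to live in a `build/` directory next to setup.py, and the helper name `remove_stale_build` is hypothetical, not part of matplotlib's setup scripts.

```python
import shutil
from pathlib import Path


def remove_stale_build(source_root):
    """Delete <source_root>/build so a later install cannot pick up old files.

    Returns True if a build directory was removed, False if none existed.
    (Hypothetical helper for illustration; not part of setup.py or setupext.py.)
    """
    build_dir = Path(source_root) / "build"
    if build_dir.is_dir():
        # Remove the whole tree, including any bundled configobj copied
        # there by an earlier "setup.py build" run.
        shutil.rmtree(build_dir)
        return True
    return False
```

Calling `remove_stale_build(".")` from the source tree before rebuilding would cover the scenario above; with distutils, running `python setup.py clean --all` before the next build should have a similar effect.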

I would like to simply remove configobj and traits from our repository. They are only used by the long-neglected experimental traited config package, which is only of interest to developers who can easily install them as external dependencies. Is it ok to remove them? If so, should it be done on all the branches?

How long are we going to continue to maintain the different branches? It was so much easier back when we only had to worry about the trunk…

You can remove them from the trunk. They should remain on the 0.91 branch as is (with any known bugs fixed and merged) since that is the point of the branch (stability for those who cannot upgrade – in principle someone might be depending on the traited config, in practice unlikely). As for the 0.98 branch, it is slated for destruction so no worries. I share your visceral reaction against branches, but my head is starting to override this bodily reaction, as I see the need for them in practice. If we carefully document the best practices and motivations in the developer’s guide, we can use them advantageously.

At least we have a nice overview of the procedure in the developers guide. Thanks for that.

I will remove these from the trunk, but I might not get to it until this afternoon or evening. I have something pressing this morning at work.

We have a lot of people contributing to mpl, and approaching or just after release time we need some mechanism for stabilizing the tested feature set of the release candidate while allowing other development to proceed, and branches are the natural mechanism for that. That we are starting to use them is a reflection of the fact that we have many more active developers than we ever had before (12 developers contributed between 98.3 and 98.4; it used to be just 3 or 4 at a time). I wouldn’t be advocating branches otherwise, because I am an advocate of doing things as simply as possible: “Make everything as simple as possible, but not simpler.”

Having worked with bzr and launchpad for a few months now, I wonder if we might consider such an approach in the future. I know there has been some experimentation with a git repository, is git supported on windows now?

···

On Wed, Dec 10, 2008 at 11:10 PM, John Hunter <jdh2358@…149…> wrote:

On Wed, Dec 10, 2008 at 9:20 PM, Darren Dale <dsdale24@…149…> wrote:

Michael_Droettboom · December 11, 2008, 2:55pm

Darren Dale wrote:

    We have a lot of people contributing to mpl, and approaching or
    just after release time we need some mechanism for stabilizing the
    tested feature set of the release candidate while allowing other
    development to proceed, and branches are the natural mechanism for
    that. That we are starting to use them is a reflection of the
    fact that we have many more active developers than we ever had
    before (12 developers contributed between 98.3 and 98.4, it used
    to be just 3 or 4 at a time). I wouldn't be advocating branches
    otherwise, because I am an advocate of doing things as simply as
    possible: "Make everything as simple as possible, but not
    simpler."

Having worked with bzr and launchpad for a few months now, I wonder if we might consider such an approach in the future. I know there has been some experimentation with a git repository, is git supported on windows now?

I'm not sold that bzr/hg/git makes things simpler for this development model. Brett Cannon is currently developing a PEP to propose DVCS for CPython development. See here:

http://docs.google.com/View?docid=dg7fctr4_40dvjkdg64

What John is proposing for matplotlib is identical to the "Backport" use case in the PEP. As you can see, hg actually makes things a lot more complicated than svn/svnmerge in this regard. I don't know if bzr (which I have almost no experience with), or git (with which I've tried to do this kind of thing but haven't found the magic incantation yet), are any better in this regard. Perhaps they are, but it's difficult to find documentation on "methodologies" rather than just "methods" for these youngish tools. I think it would be fantastic if anyone with enough knowledge were able to help Brett flesh out his PEP, because then we'd all have a really clean comparison between the tools for specific use cases. And it's a very scientific and "spin-free" way forward, IMHO.

So, I'm not meaning to jump on your suggestion specifically, Darren, but I think I've reached some sort of level of fatigue with people saying (mainly outside matplotlib) "if we just switched to X, all this merging/branching would be so much easier", without a specific description of how to migrate to and use X and how that's superior enough to warrant the effort. I don't mean that rhetorically -- I actually believe anything is probably better than Subversion, but specifically why and how is so often lacking.

I happen to like svnmerge, because one developer (a VC specialist, if you will) can set up the branching and merge tracking and all anyone else needs to know is "svnmerge.py merge", if they care about merging at all. With the DVCS tools, it always feels like everyone is forced to know the topology of the project just to do anything with it -- that's a matter of style more than the tool, but DVCS do seem to encourage a more spaghetti-branched approach from what I've seen.

Mike

Andrew_Straw5 · December 11, 2008, 7:20pm

Michael Droettboom wrote:

Darren Dale wrote:

Having worked with bzr and launchpad for a few months now, I wonder if
we might consider such an approach in the future. I know there has
been some experimentation with a git repository, is git supported on
windows now?

I'm not sold that bzr/hg/git makes things simpler for this development
model.

My thought is that matplotlib.sourceforge.net is a centralized website
making centralized, official releases and other centralized facilities.
Thus, it seems to me that a centralized, official version control branch
is an entirely reasonable thing to have. svn provides a
least-common-denominator for this job, and I don't see the reasons to
shift to bzr/hg/git as sufficiently strong to merit such a shift. In
particular, the svn model is pretty darn simple, and therefore easy to
interface with (whether you're a human or a computer program).

Of course, part of why I think this way is that git seems to be working
pretty well for inter-operation with the official svn repository. My
experimental repository, described at
http://matplotlib.sourceforge.net/faq/installing_faq.html#install-from-git
, is nicely allowing me to browse history locally, do git bisect,
maintain my own branches, commit back to the svn repository when
desired, and so on. I think there *may* be some impedance mis-matching
if we tried to really map git branches on svnmerge branches, but right
now that hasn't been an issue I've pursued.

As far as git on Windows: I think there's some kind of msys git and also
the cygwin approach. Not using windows much, though, I'm not sure. I did
hear that Microsoft just started using github for ironruby, so
presumably something works for them.

_Drain_Theodore_R_34 · December 11, 2008, 8:44pm

John,

One thing that would help w/ a rapid delivery/response cycle
would be more comprehensive tests. They would let other developers try out
various ideas and see what breaks before you release it.

We’ve implemented an automated approach where we run an MPL
script using Agg, save the output image and then compare it against a “good”
image that someone looked at. We use PIL to do the compare and if it’s close
(that’s the hard part), then the test passes. If it’s not, someone looks at the
two images to see if the difference is benign. Something similar to this could
be done (if you’re not already) for the MPL examples to make sure that changes
don’t cause plotting problems in other areas.
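As a rough sketch of what such a "close enough" check could look like, the snippet below compares two images by their root-mean-square pixel difference. The RMS metric, the function names, and the default tolerance are assumptions made for illustration and are not necessarily what is used in the system described above; the pixel values are taken as flat sequences of numbers, however they were loaded (for example with PIL).

```python
import math


def rms_difference(expected, actual):
    """Root-mean-square difference between two equal-length pixel sequences."""
    if len(expected) != len(actual):
        raise ValueError("images differ in size")
    if not expected:
        return 0.0
    total = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(total / len(expected))


def images_close(expected, actual, tol=1.0):
    """Return True when the RMS pixel difference is at most tol.

    The tolerance is the hard part: too low and harmless rendering
    changes fail the test, too high and real regressions slip through.
    """
    return rms_difference(expected, actual) <= tol
```

A test would pass when `images_close(baseline_pixels, new_pixels)` is true and otherwise flag the pair of images for a person to inspect.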

Having this kind of system is also a great driver for people to
expand it. For example – we really care about unit processing support
everywhere. Every once in a while, we find a change that someone submits that
breaks unit support. So one of the tasks we’re going to work on next year is
to build a set of automated test cases that try to hit every plot function
with units, which can then run on every release. If there were a simple-to-use
MPL standard test system like this, other people might contribute more tests as
a way of ensuring that the things they care about stay working through various
changes.

It would also be nice to have a test system for unit testing of
components. A lot of the code that does different transformations, symbol and
color mapping, etc. could be unit tested without the need for actually
drawing anything. If there were a standard location, style, and system, people
could slowly add to the tests over time. You can also consider requiring some
level of unit test for newly submitted code wherever it’s possible.
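To make the idea concrete, here is a minimal sketch of a component-level test that needs no drawing at all. The `hex_to_rgb` function is a made-up stand-in for a color-mapping helper, not an actual matplotlib function, and the use of the standard library's `unittest` module is an assumption about which test framework might be chosen.

```python
import unittest


def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of floats in [0, 1].

    Hypothetical stand-in for a color-mapping component.
    """
    if len(color) != 7 or not color.startswith("#"):
        raise ValueError("expected a color like '#rrggbb'")
    return tuple(int(color[i:i + 2], 16) / 255.0 for i in (1, 3, 5))


class HexToRgbTest(unittest.TestCase):
    def test_primary_colors(self):
        self.assertEqual(hex_to_rgb("#ff0000"), (1.0, 0.0, 0.0))
        self.assertEqual(hex_to_rgb("#0000ff"), (0.0, 0.0, 1.0))

    def test_rejects_malformed_input(self):
        with self.assertRaises(ValueError):
            hex_to_rgb("red")
```

Tests written in this style can be collected from a standard location and run with `python -m unittest`, so contributors can add cases over time without touching any backend.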

Just some thoughts…

Ted



_Jordan_Mantha · December 11, 2008, 8:56pm

Michael Droettboom wrote:

Darren Dale wrote:

Having worked with bzr and launchpad for a few months now, I wonder if
we might consider such an approach in the future. I know there has
been some experimentation with a git repository, is git supported on
windows now?

I'm not sold that bzr/hg/git makes things simpler for this development
model.

My thought is that matplotlib.sourceforge.net is a centralized website
making centralized, official releases and other centralized facilities.
Thus, it seems to me that a centralized, official version control branch
is an entirely reasonable thing to have. svn provides a
least-common-denominator for this job, and I don't see the reasons to
shift to bzr/hg/git as sufficiently strong to merit such a shift. In
particular, the svn model is pretty darn simple, and therefore easy to
interface with (whether you're a human or a computer program).

I just wanted to interject some things from the bzr side. Bzr can work
either in a distributed or centralized model which makes it sort of
handy for SVN people, though I don't personally see that as any good
motivation to completely convert the whole thing to bzr. At this point
in time as I've had a chance to work with several projects using SVN,
git, and bzr I agree with you that SVN is a good
least-common-denominator, especially if the project is already using
SVN.

Of course, part of why I think this way is that git seems to be working
pretty well for inter-operation with the official svn repository. My
experimental repository, described at
http://matplotlib.sourceforge.net/faq/installing_faq.html#install-from-git
, is nicely allowing me to browse history locally, do git bisect,
maintain my own branches, commit back to the svn repository when
desired, and so on. I think there *may* be some impedance mis-matching
if we tried to really map git branches on svnmerge branches, but right
now that hasn't been an issue I've pursued.

bzr-svn is also fantastic, I use it all the time. I also wanted to
point out that Launchpad has a bzr mirror of matplotlib going so if
you want to play around with bzr you can try that. In the past bzr has
been very slow when branching but with the latest stable release I
got:

$ time bzr branch lp:matplotlib
Branched 4100 revision(s).

real 2m9.893s

Which is pretty impressive considering you're getting the entire
history. Anyway, I'm not a bzr fanatic but I just wanted to point out
the above for completeness and in case anybody was interested.

-Jordan

···

On Thu, Dec 11, 2008 at 11:20 AM, Andrew Straw <strawman@...36...> wrote:

Andrew_Straw5 · December 11, 2008, 9:37pm

I have also developed some image-based unit tests to compare MPL output,
and I completely agree that getting something like this into MPL is a
very good idea. As Ted writes, the hard part is defining "close" for
images. Minor changes to, for example, agg pixel alignment, cause the
tests to fail when they have low tolerances. So, such tests are a bit
more interactive than plain old unit tests when something changes, but I
think that's worth the pain.

Prabhu Ramachandran has also done similar things for mayavi2 using VTK's
image comparison (see compare_image_with_saved_image in
https://svn.enthought.com/enthought/browser/Mayavi/trunk/integrationtests/mayavi/common.py
).

I'm attaching the code I use to compare images as a starting point -- it
currently uses scipy to load the images, but this could surely be worked
around.

-Andrew

Drain, Theodore R wrote:

image_compare.py (1.93 KB)

···




Gael_Varoquaux1 · December 12, 2008, 6:05am

Yeah, and it always fails due to the hardware rendering being slightly
different on different computers :). I think we are giving up on this
approach.

Gaël

···

On Thu, Dec 11, 2008 at 01:37:01PM -0800, Andrew Straw wrote:

Prabhu Ramachandran has also done similar things for mayavi2 using VTK's
image comparison (see compare_image_with_saved_image in
https://svn.enthought.com/enthought/browser/Mayavi/trunk/integrationtests/mayavi/common.py
).

_Darren_Dale2 · December 12, 2008, 1:41pm

There has been a report at the bugtracker complaining that matplotlib is overwriting an existing installation of configobj. I had a look at the code and thought the bug report must be a mistake or windows specific, but I just saw similar behavior on my linux system.

Ignoring for a minute the question of whether we can/should flush configobj/traits, it sounds like the real problem is that setup.cfg is not working like we expect it to. And that is something that should be fixed if it is broken. If mpl is installing configobj or traits even if we are telling it not to via setup.cfg, then we have a problem. This is worth knowing, since the last mpl release was broken vis-a-vis the default backend on win32, which could be explained by a broken setup.cfg.

I would like to simply remove configobj and traits from our repository. They are only used by the long-neglected experimental traited config package, which is only of interest to developers who can easily install them as external dependencies. Is it ok to remove them? If so, should it be done on all the branches?

[…]

You can remove them from the trunk. They should remain on the 0.91 branch as is (with any known bugs fixed and merged) since that is the point of the branch (stability for those who cannot upgrade – in principle someone might be depending on the traited config, in practice unlikely).

I just removed them from the trunk, but not from the 0.91 or 0.98.5 branches. I was going to add a note to the API_CHANGES log. Was it removed?

···

On Wed, Dec 10, 2008 at 11:10 PM, John Hunter <jdh2358@…149…> wrote:

On Wed, Dec 10, 2008 at 9:20 PM, Darren Dale <dsdale24@…149…> wrote:

Michael_Droettboom · December 12, 2008, 2:04pm

Darren Dale wrote:

        There has been a report at the bugtracker complaining that
        matplotlib is overwriting an existing installation of
        configobj. I had a look at the code and thought the bug report
        must be a mistake or windows specific, but I just saw similar
        behavior on my linux system.

    Ignoring for a minute the question of whether we can/should flush
    configobj/traits, it sounds like the real problem is that
    setup.cfg is not working like we expect it to. And that is
    something that should be fixed if is broken. If mpl is installing
    configobj or traits even if we are telling it not to via
    setup.cfg, then we have a problem. This is worth knowing, since
    the last mpl release was broken vis-a-vis the default backend on
    win32, which *could* be explained by a broken setup.cfg.

        I would like to simply remove configobj and traits from our
        repository. They are only used by the long-neglected
        experimental traited config package, which is only of interest
        to developers who can easily install them as external
        dependencies. Is it ok to remove them? If so, should it be
        done on all the branches?

[...]

    You can remove them from the trunk. They should remain on the
    0.91 branch as is (with any known bugs fixed and merged) since
    that is the point of the branch (stability for those who cannot
    upgrade -- in principle someone might be depending on the traited
    config, in practice unlikely).

I just removed them from the trunk, but not from the 0.91 or 0.98.5 branches. I was going to add a note to the API_CHANGES log. Was it removed?

API_CHANGES was moved to doc/api/api_changes.rst so it gets automatically put up on the website.

Cheers,
Mike

···

On Wed, Dec 10, 2008 at 11:10 PM, John Hunter <jdh2358@…149…> wrote:

On Wed, Dec 10, 2008 at 9:20 PM, Darren Dale <dsdale24@…149…> wrote:

--
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

_Drain_Theodore_R_34 · December 16, 2008, 5:12pm

Continued from: requesting permission to remove traits and configobj...

Gael,
There might be ways to handle these problems. A lot depends on what we're trying to test. I agree that if we take the example scripts, run them, and save the plots, we'll never get an automated test harness to figure things out because of machine differences.

However, if we set the goal of the testing to be that we make sure that MPL runs, that it accepts the correct options, and produces a plot, then we can be more creative with the testing harness. We can do significant amounts of testing with a fixed back end (Agg probably) that generates an image for comparison. We've tried a number of things in our own testing work which could help. The first step is to identify why plots on different machines are different: numeric differences in input data, agg output, font differences, colors, etc, etc.

Some ideas that might (or might not) help:

- Use wide lines that are grey in color for everything. The plot looks crazy but then if you get one pixel shifts, it isn't a case of the pixels going "white, black, white" on one machine and "white, white, black" on another - you end up with most of the line overlapping which makes image comparisons easier.

- Never generate the input data on the machine you're on. For example, never do this:

    t = arange(0.0, 3.01, 0.01)
    s = sin(2*pi*t)

  The computed values can differ between machines. A better way is to run this once on the machine that will generate the "correct" image and then save the numbers, either with pickle or by embedding them in the script.

- Embed a font with the tests to eliminate font server differences (no experience with this so I'm not sure how hard this would be). We could even create a dummy font that just has black squares for each character - it still tests that everything is drawn in the correct place and runs properly and eliminates subtle character differences.

- Create a testing backend that records the drawing commands as a set of meta-data like (draw red line from point 1 to point 2). The test case then checks that the proper commands were issued by the test script. This eliminates drawing completely. A nice comparison suite would allow loose comparisons like "make sure a vertical line was drawn from (10,20) to (30,40) with a pixel slop of 2 pixels".

- Smarter image comparison algorithms. We currently use something that processes the image with PIL and looks at an averaged pixel difference (it's not perfect by any means). I'll try to talk to some of the people here who work in image processing to see if there are any fuzzy image comparison algorithms they can recommend.
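To make the first and last ideas concrete, here is a minimal sketch of an averaged pixel difference check. It is not the harness described above: it works on plain nested lists of 0-255 grey values so it has no dependencies, and the names `mean_abs_diff`, `images_match`, `vertical_line` and the default tolerance are all made up for illustration. A real harness would load the images with PIL first.

```python
def mean_abs_diff(img_a, img_b):
    """Average absolute per-pixel difference of two greyscale images.

    Each image is a list of rows of 0-255 ints; shapes must match.
    """
    if len(img_a) != len(img_b) or any(
        len(ra) != len(rb) for ra, rb in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    count = sum(len(row) for row in img_a)
    if count == 0:
        raise ValueError("images must not be empty")
    total = sum(
        abs(a - b) for ra, rb in zip(img_a, img_b) for a, b in zip(ra, rb)
    )
    return total / count


def images_match(img_a, img_b, tol=10.0):
    """Loose comparison: pass if the mean difference is within tol (assumed value)."""
    return mean_abs_diff(img_a, img_b) <= tol


def vertical_line(width, height, start, thickness, value):
    """White image with one vertical line of the given grey value."""
    return [
        [value if start <= x < start + thickness else 255 for x in range(width)]
        for _ in range(height)
    ]


# A thin black line shifted by one pixel versus a wide grey line shifted by one pixel.
thin_diff = mean_abs_diff(
    vertical_line(10, 10, 4, 1, 0), vertical_line(10, 10, 5, 1, 0)
)
wide_diff = mean_abs_diff(
    vertical_line(10, 10, 2, 5, 128), vertical_line(10, 10, 3, 5, 128)
)
print(thin_diff, wide_diff)
```

In this toy case the one pixel shift costs the wide grey line about half as much mean difference as the thin black line, because most of the wide line still overlaps. That is the effect the "wide grey lines" idea relies on.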

Ted

···

-----Original Message-----
From: Gael Varoquaux [mailto:gael.varoquaux@…427…]
Sent: Thursday, December 11, 2008 10:06 PM
To: Andrew Straw
Cc: Drain, Theodore R; matplotlib-devel@lists.sourceforge.net
Subject: Re: [matplotlib-devel] requesting permission to remove traits
and configobj

On Thu, Dec 11, 2008 at 01:37:01PM -0800, Andrew Straw wrote:
> Prabhu Ramachandran has also done similar things for mayavi2 using VTK's
> image comparison (see compare_image_with_saved_image in
> https://svn.enthought.com/enthought/browser/Mayavi/trunk/integrationtests/mayavi/common.py
> ).

Yeah, and it always fails due to the hardware rendering being slightly
different on different computers :). I think we are giving up on this
approach.

Gaël


_John_Hunter · December 16, 2008, 5:36pm

- Embed a font with the tests to eliminate font server differences (no experience with this so I'm not sure how hard this would be). We could even create a dummy font that just has black squares for each character - it still tests that everything is drawn in the correct place and runs properly and eliminates subtle character differences.

Since we ship our own fonts, it is pretty easy to test against a known
set. Just set up the rc to use Vera, and cm* or stix* for math.
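As a rough sketch of what that rc setup could look like in a test module: the dictionary name `TEST_RC` is made up, and the exact font family string is an assumption about how the bundled Vera font is registered. Newer matplotlib releases bundle DejaVu Sans rather than Vera, so the family name would need adjusting there.

```python
import matplotlib

matplotlib.use("Agg")  # fixed, non-interactive backend for tests

from matplotlib import rcParams

# Hypothetical test-wide settings: pin text and math fonts to bundled ones.
TEST_RC = {
    "font.family": "sans-serif",
    "font.sans-serif": ["Bitstream Vera Sans"],  # assumed family name
    "mathtext.fontset": "stix",  # or "cm"
}

rcParams.update(TEST_RC)
```

Applying this once before any figure is created keeps the font choice out of the individual tests.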

- Create a testing backend that records the drawing commands as a set of meta-data like (draw red line from point 1 to point 2). The test case then checks that the proper commands were issued by the test script. This eliminates drawing completely. A nice comparison suite would allow loose comparisons like "make sure a vertical line was drawn from (10,20) to (30,40) with a pixel slop of 2 pixels".

I think this idea has promise.
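A toy sketch of how such a recording approach might look, just to fix ideas. `RecordingRenderer` and `has_line` are hypothetical names, not part of any existing matplotlib backend API, and a real version would have to implement the actual renderer interface.

```python
class RecordingRenderer:
    """Hypothetical test renderer: stores draw commands instead of rasterizing."""

    def __init__(self):
        self.commands = []

    def draw_line(self, start, end, color):
        # Record the call as plain metadata for later inspection.
        self.commands.append(("line", tuple(start), tuple(end), color))


def _close(p, q, slop):
    """True if points p and q agree within slop pixels on both axes."""
    return abs(p[0] - q[0]) <= slop and abs(p[1] - q[1]) <= slop


def has_line(commands, start, end, color=None, slop=2):
    """Check that some recorded line joins start and end within slop pixels.

    Direction is ignored; color is only checked when it is given.
    """
    for kind, s, e, c in commands:
        if kind != "line":
            continue
        if color is not None and c != color:
            continue
        forward = _close(s, start, slop) and _close(e, end, slop)
        backward = _close(s, end, slop) and _close(e, start, slop)
        if forward or backward:
            return True
    return False


# Usage: the code under test draws one red line, slightly off the nominal points.
renderer = RecordingRenderer()
renderer.draw_line((11, 19), (30, 41), "red")

found = has_line(renderer.commands, (10, 20), (30, 40), color="red")
missed = has_line(renderer.commands, (10, 20), (30, 40), slop=0)
print(found, missed)
```

With a slop of 2 pixels the shifted line is accepted, and with a slop of 0 it is rejected, so the tolerance is an explicit part of each assertion rather than a property of the rendered image.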

I'll also add that we can do *a lot* more with simple API tests that
do not look at the output but make sure the inputs, in all their
variety, are at least accepted. backend_driver does this to an
extent, but I will be adding nose tests for any new features I add.
E.g., for the recent markevery, which can be None, an integer or a
length 2 tuple, I added this simple test to unit/nose_tests:

import numpy as np
import matplotlib.pyplot as plt


def test_markevery():
    # random data; the test only checks that each markevery form is accepted
    x, y = np.random.rand(2, 100)

    # check marker only plot
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(x, y, 'o', label='default')
    ax.plot(x, y, 'd', markevery=None, label='mark all')
    ax.plot(x, y, 's', markevery=10, label='mark every 10')
    # a tuple is (start, stride): start at index 5, then every 20th point
    ax.plot(x, y, '+', markevery=(5, 20), label='mark every 20 starting at 5')
    ax.legend()
    fig.canvas.draw()
    plt.close(fig)

    # check line/marker combos
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.plot(x, y, '-o', label='default')
    ax.plot(x, y, '-d', markevery=None, label='mark all')
    ax.plot(x, y, '-s', markevery=10, label='mark every 10')
    ax.plot(x, y, '-+', markevery=(5, 20), label='mark every 20 starting at 5')
    ax.legend()
    fig.canvas.draw()
    plt.close(fig)

Having well defined tests that heavily exercise the frontend API would
be a significant step forward, and the harder question of output
comparison could be added in.

JDH

···

On Tue, Dec 16, 2008 at 11:12 AM, Drain, Theodore R <theodore.r.drain@...179...> wrote:

Gael_Varoquaux1 · December 16, 2008, 6:45pm

Absolutely. I was just saying that we were giving up because, in our
case, the cost was bigger than the benefit. However, we rely strongly on
hardware acceleration, so we don't completely control our rendering
pipeline, and images can vary by a huge amount (e.g. when the z-ordering is
screwed up). You can always design more robust tests by proper analysis of
the images, but that could also be a PhD research program :).

I am not saying MPL shouldn't be going down that way, though.

Gaël

···

On Tue, Dec 16, 2008 at 09:12:24AM -0800, Drain, Theodore R wrote:

Continued from: requesting permission to remove traits and configobj...

Gael,
There might be ways to handle these problems. A lot depends on what we're trying to test. I agree that if we take the example scripts, run them, and save the plots, we'll never get an automated test harness to figure things out because of machine differences.