Directories for C/C++ extensions

Fellow developers,

I am working on a PR to replace the use of matplotlib.delaunay with the Qhull library. Installation will be similar to the existing packages LibAgg and CXX in that if the system already has a sufficiently recent version of Qhull installed then matplotlib will use that, otherwise it will build the required library from the source code shipped with matplotlib.

I have a thin C wrapper called qhull_wrap.c (following the coding guidelines) which I’ll put in the top-level src directory along with most of the existing C/C++ extensions. But my question is where to put the qhull source code?

Current practice has separate top-level directories called agg24 and CXX for the LibAgg and CXX packages respectively, so my initial thought was to follow this and create a new top-level directory called qhull to place the library code in. But I don’t like this approach of creating a new top-level directory as (1) I think the top-level should remain as simple and uncluttered as possible, (2) it tends to overemphasize the importance of these third-party libraries as they are some of the first directories users see when unzipping the mpl tarball, and (3) it is not immediately obvious that the code in these directories is from third-party libraries rather than something we ourselves have written.

Hence my preference is to create a new top-level directory called something like ‘third-party’ (or should that be ‘third_party’?), and place all the third-party libraries in that; i.e. move the agg24 and CXX directories into third-party, and place the new qhull source code in third-party/qhull.

What do others think of this idea?

Ian Thomas
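
A minimal sketch of the install behaviour described in the message above, assuming a system Qhull would be found through pkg-config; the thread does not say how the real check is made. The names MIN_QHULL_VERSION, parse_version, system_qhull_version and choose_qhull, the pkg-config package name "qhull", and the version threshold are all hypothetical.

```python
import re
import shutil
import subprocess

# Hypothetical minimum version; the thread does not state the real threshold.
MIN_QHULL_VERSION = (2012, 1)


def parse_version(text):
    """Turn a version string such as '2012.1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def system_qhull_version(pkg_name="qhull"):
    """Ask pkg-config for an installed version; return None if unavailable.

    The package name 'qhull' is an assumption and may differ between systems.
    """
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", pkg_name],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return parse_version(result.stdout.strip())


def choose_qhull(system_version, minimum=MIN_QHULL_VERSION):
    """Return 'system' if the installed version is recent enough, else 'bundled'."""
    if system_version is not None and system_version >= minimum:
        return "system"
    return "bundled"
```

Calling choose_qhull(system_qhull_version()) would then pick between the two build paths; a machine with no Qhull (or no pkg-config) falls through to the bundled copy.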

Adding this top-level directory is OK with me, but since I hope we will not need to carry along very many such library source trees, it doesn't seem so important to segregate them in this way. If you do, alternative names might be "dependencies" or "external". The contents don't necessarily match what can be found elsewhere; Mike has needed to make local patches on occasion.

Eric

···

On 2013/10/06 10:09 AM, Ian Thomas wrote:

Ian Thomas

I like this idea. I've seen this called "extern" in other projects, but I don't have a strong feeling about the name. I think it's a good idea for all of the reasons you mention.

Mike

···


OK, 'extern' seems the best directory name. After I've finished the qhull
PR I'll create another one to move the existing directories across.

Ian

···

On 7 October 2013 15:22, Michael Droettboom <mdroe@...31...> wrote:

> I like this idea. I've seen this called "extern" in other projects, but I
> don't have a strong feeling about the name. I think it's a good idea for all
> of the reasons you mention.

Ian,

> I am working on a PR to replace the use of matplotlib.delaunay with the
> Qhull library.

nice! -- ( though I sure wish Qhull did constrained Delaunay...)

> Installation will be similar to the existing packages LibAgg
> and CXX in that if the system already has a sufficiently recent version of
> Qhull installed then matplotlib will use that, otherwise it will build the
> required library from the source code shipped with matplotlib.

Why bother, why not just always build the internal version?

(for that matter, same with agg)

Wouldn't it be a lot easier and more robust to be sure that everyone
is running the exact same code?

What are the odds that folks are using qhull for something else, and
even more to the point, what are the odds that the duplication of this
lib would matter one whit?

This isn't like LAPACK, where folks have a compelling reason to run a
particular version.

-- just my thoughts on how to keep things simpler.

-Chris

···

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

From a Linux distro packaging perspective, bundled external libs are a big no-no. If a patch is needed for whatever reason, packagers don’t want to have to go and hunt down dozens of copies of the same library. In some cases there is no alternative, but it should be avoided whenever possible.

···

Chris,

Todd has hit the nail on the head.

To expand slightly, with the current situation the onus is on us to ensure
that mpl builds OK and passes all of our tests with and without each of the
external libraries. Linux distro packagers will choose to set up qhull as
a required dependency for their mpl package, and once they have done this
can simply delete our directory containing the qhull source code in their
mpl source package, and it will build OK without any further changes and we
can all be confident that it will work correctly.

If we always used our internal version then distro packagers would have to
change our setup scripts to build using the external libraries. This would
be more time-consuming and error prone leading to less timely mpl distro
releases. We need to make their job as easy as possible.

Ian
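
The "delete our directory and it still builds" behaviour described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not matplotlib's actual setup code: the function name qhull_build_options is hypothetical, the directory extern/qhull uses the name settled on earlier in the thread, the library name "qhull" is assumed, and the sketch only checks whether the bundled directory is present (it does not check the system version).

```python
import os


def qhull_build_options(source_root, bundled_dir=os.path.join("extern", "qhull")):
    """Return Extension-style keyword arguments for the Qhull wrapper.

    If the bundled directory exists, its C files are compiled in alongside
    the wrapper; otherwise the wrapper is linked against a system library
    (library name 'qhull' is an assumption).
    """
    wrapper = os.path.join("src", "qhull_wrap.c")
    bundled_path = os.path.join(source_root, bundled_dir)
    if os.path.isdir(bundled_path):
        # Bundled copy present: compile its sources into the extension.
        c_files = sorted(
            os.path.join(bundled_dir, name)
            for name in os.listdir(bundled_path)
            if name.endswith(".c")
        )
        return {
            "sources": [wrapper] + c_files,
            "include_dirs": [bundled_dir],
            "libraries": [],
        }
    # Bundled copy removed (e.g. by a distro packager): link to the system lib.
    return {"sources": [wrapper], "include_dirs": [], "libraries": ["qhull"]}
```

With this shape a packager changes nothing in the build script: removing the directory is enough to switch the wrapper over to the system library.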

···

On 18 October 2013 19:18, Chris Barker <chris.barker@...236...> wrote:

> Ian,
>
>> I am working on a PR to replace the use of matplotlib.delaunay with the
>> Qhull library.
>
> nice! -- ( though I sure wish Qhull did constrained Delaunay...)
>
>> Installation will be similar to the existing packages LibAgg
>> and CXX in that if the system already has a sufficiently recent version of
>> Qhull installed then matplotlib will use that, otherwise it will build the
>> required library from the source code shipped with matplotlib.
>
> Why bother, why not just always build the internal version?
>
> (for that matter, same with agg)
>
> Wouldn't it be a lot easier and more robust to be sure that everyone
> is running the exact same code?
>
> What are the odds that folks are using qhull for something else, and
> even more to the point, what are the odds that the duplication of this
> lib would matter one whit?
>
> This isn't like LAPACK, where folks have a compelling reason to run a
> particular version.
>
> -- just my thoughts on how to keep things simpler.

Agreed on all of these points, and I'm not advocating a change from
what Ian is doing. However, as I get on in years, I increasingly feel
that the needs of the distro packagers, which are primarily security
and stability, are sometimes at odds with the needs of scientific
software, where the premium is on reproducibility. The output of
matplotlib depends on the versions of some of its dependencies, not
the version of matplotlib alone, and that's problematic for some...
Anyway, just food for thought. I still think the most practical
approach is the one we're taking (shipping dependencies, but making it
easy to use the system libraries when available).

Mike
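
One way to act on the reproducibility concern above is to record the dependency versions alongside any output. The following is only a sketch: the function name dependency_versions is hypothetical, and importlib.metadata sees Python distributions only, so C libraries such as Qhull would need to be queried some other way.

```python
from importlib import metadata


def dependency_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; record that explicitly.
            versions[name] = None
    return versions
```

For example, dependency_versions(["matplotlib", "numpy"]) could be stored next to a generated figure so that a later run can be compared against the same versions.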

···

On 10/19/2013 04:14 AM, Ian Thomas wrote:

> Chris,
>
> Todd has hit the nail on the head.
>
> To expand slightly, with the current situation the onus is on us to
> ensure that mpl builds OK and passes all of our tests with and without
> each of the external libraries. Linux distro packagers will choose to
> set up qhull as a required dependency for their mpl package, and once
> they have done this can simply delete our directory containing the
> qhull source code in their mpl source package, and it will build OK
> without any further changes and we can all be confident that it will
> work correctly.
>
> If we always used our internal version then distro packagers would
> have to change our setup scripts to build using the external
> libraries. This would be more time-consuming and error prone leading
> to less timely mpl distro releases. We need to make their job as easy
> as possible.

http://www.droettboom.com

> To expand slightly, with the current situation the onus is on us to ensure
> that mpl builds OK and passes all of our tests with and without each of the
> external libraries.

If you only have internal libs, then there is less to do -- it only
needs to work with the version you bundle. And making sure it works
with any-old-version-that-may-not-exist-yet is a pretty formidable
challenge!

> Linux distro packagers will choose to set up qhull as a
> required dependency for their mpl package, and once they have done this can
> simply delete our directory containing the qhull source code in their mpl
> source package,

If they are going to insist on doing this, then, yes you should
certainly do it this way.

> we can all be confident that it will work correctly.

only if you've tested against the version (maybe patched) of the
external lib they are using...

> If we always used our internal version then distro packagers would have to
> change our setup scripts to build using the external libraries. This would
> be more time-consuming and error prone leading to less timely mpl distro
> releases. We need to make their job as easy as possible.

it's easiest for them if they don't try to pull out an included
dependency -- but maybe you're right that they REALLY want to do that!

> The needs of the distro packagers, which are primarily security and
> stability, are sometimes at odds with the needs of scientific software,
> where the premium is on reproducibility. The output of matplotlib depends
> on the versions of some of its dependencies, not the version of matplotlib
> alone, and that's problematic for some...

exactly -- if we know exactly which version of a lib is in use, we
know that it works the way we expect in the use cases we expect to use
it in.

But I'm not maintaining this code, so have at it in the way that makes
sense to you.

NOTE: it would be a different story if this were a networking protocol
lib, or something where security patches would be critical. Maybe I'm
naive, but this doesn't seem likely in this case.

-Chris

···

>> we can all be confident that it will work correctly.
>
> only if you've tested against the version (maybe patched) of the
> external lib they are using...

Of course not. We provide the framework to build mpl and run tests.
Distro developers choose how they want to build it and then run our tests.
If the tests pass then both they and we are confident that it works
correctly. We haven't had to test against anyone else's choice of library
version.

> But I'm not maintaining this code, so have at it in the way that makes
> sense to you.

This has nothing to do with what makes sense to me; it is about following
the existing policy on C/C++ extensions when adding a new one. Just
because I am taking time to answer your questions, don't presume I am taking
a particular stance. In fact I don't have a preference either way; I am
just doing the work that is required in a way that is consistent with
existing code. If there is a change of policy I am happy to do it a
different way.

Ian

···

On 21 October 2013 18:36, Chris Barker <chris.barker@...236...> wrote:

>> To expand slightly, with the current situation the onus is on us to ensure
>> that mpl builds OK and passes all of our tests with and without each of the
>> external libraries.
>
> If you only have internal libs, then there is less to do -- it only
> needs to work with the version you bundle. And making sure it works
> with any-old-version-that-may-not-exist-yet is a pretty formidable
> challenge!

We have sonames for this very reason. And this could apply just as well to
Python itself. There is a reason not many distros ship SAGE packages.

>> Linux distro packagers will choose to set up qhull as a
>> required dependency for their mpl package, and once they have done this can
>> simply delete our directory containing the qhull source code in their mpl
>> source package,
>
> If they are going to insist on doing this, then, yes you should
> certainly do it this way.

Yes, they are. This is the whole point of having packages in the first
place, as opposed to something like Windows where every program just
bundles everything.

>> we can all be confident that it will work correctly.
>
> only if you've tested against the version (maybe patched) of the
> external lib they are using...

It is only matplotlib's responsibility to test against the unpatched
versions specified, just like it is only matplotlib's responsibility to
test against the unpatched Python versions specified. Doing this isn't a
particularly difficult task; there are easily tens of thousands of apps
that have no problem with this.

>> If we always used our internal version then distro packagers would have to
>> change our setup scripts to build using the external libraries. This would
>> be more time-consuming and error prone leading to less timely mpl distro
>> releases. We need to make their job as easy as possible.
>
> it's easiest for them if they don't try to pull out an included
> dependency -- but maybe you're right that they REALLY want to do that!

It would be easiest if matplotlib detected whether the library is present
at build-time. That is what most packages do.

>> The needs of the distro packagers, which are primarily security and
>> stability, are sometimes at odds with the needs of scientific software,
>> where the premium is on reproducibility. The output of matplotlib depends
>> on the versions of some of its dependencies, not the version of matplotlib
>> alone, and that's problematic for some...
>
> exactly -- if we know exactly which version of a lib is in use, we
> know that it works the way we expect in the use cases we expect to use
> it in.
>
> But I'm not maintaining this code, so have at it in the way that makes
> sense to you.
>
> NOTE: it would be a different story if this were a networking protocol
> lib, or something where security patches would be critical. Maybe I'm
> naive, but this doesn't seem likely in this case.

You would be surprised what sort of packages can lead to security
vulnerabilities.

···

On Mon, Oct 21, 2013 at 7:36 PM, Chris Barker <chris.barker@...236...> wrote: