cxx improvements

Mike,

I hope you don't mind my including the devel list on this; I think it is a very important topic (although it may be one that ends up coming down to each individual's personal experience and preferences).

The reasons pycxx worries me are:
0) C++ worries me in general--I know lots of wonderful code like Agg is written in C++, but personally I find C++ hard to deal with. This is mainly my problem. (But maybe not entirely. From the SWIG history:
"February 11, 2000. SWIG1.3 alpha released. This is the first in a series of releases that slowly migrate most of SWIG's implementation back to ANSI C.")
1) Transform module classes wrapped with pycxx have some strange characteristics, and don't behave like nice python classes. Sorry this is vague, but I stumbled on it, and John confirmed it, months ago, and then I blissfully forgot all the details. I can probably dredge them up from the mailing list if it would be of interest.
2) Given my discomfort with C++, I am even more uncomfortable relying on what seems to be an almost unmaintained component. All the warnings were worrying me, for example. Your recent fixups have reduced that source of discomfort!

Swig: I agree with you on that one. I have never taken to swig either, although I recognize that it has many fans. I'm afraid that it may be the only automated game in town for C++ apart from PyCxx, though. GvR's comments are interesting.

Pyrex: now, there is something I like! Not so great for C++, but it surely is nice for wrapping C, and for speeding some things up while writing "almost Python".

Native API: I've done a little of that--the cntr.c interface code. The good things are that it eliminates another layer of dependency, and what you see is what you get. It is pretty tedious, though, and there are lots of gotchas. For me it is a reasonable choice for C code, although I would still gravitate towards Pyrex in most cases. For C++, I would not know where to start. Again, that's my personal handicap.

Eric

Michael Droettboom wrote:

Eric Firing wrote:

Mike,

Thanks for fixing pycxx to remove all those warnings!

I am curious: have you looked at the most recent upstream pycxx? My impression is that there were supposedly some changes to support python 2.5, and a couple months ago I made a feeble and unsuccessful attempt to use the updated version in place of the one included in mpl.

I looked at it a while ago because I suspected it might be the cause of a segfault. (Turned out it wasn't). I diffed it against the mpl one and didn't see much of significance, but I was really only looking for potential memory usage mistakes.

It looks as if some Python2.5 changes have already been made in mpl. All the references to Py_ssize_t are a Python2.5-ism, for instance.

In any case, perhaps your improvements can be incorporated upstream.

It actually looks like someone else has already arrived at the identical fix!

Longer term, I will be pleased if our dependence on pycxx can be removed.

It's funny you bring this up. I know that's on John's wish list. I just spent a little time looking at SWIG (trying to convert my simple ttconv extension to use SWIG), and I have some real concerns about it.

For one, the syntax is weird, and it seems like any real-world non-trivial extension requires all kinds of magic goo.

Secondly, it generates very bloated and suboptimal wrapper code, even after experimenting with the myriad of options to cut down on what gets generated. With my ttconv extension the object file doubled in size, since it appears to be throwing in all kinds of stuff we don't need.

GvR said:

"I've yet to see an extension module using SWIG that doesn't make me think it was a mistake to use SWIG instead of manually written wrappers. The extra time paid upfront to create hand-crafted wrappers is gained back hundredfold by time saved debugging the SWIG-generated code later."

http://www.artima.com/weblogs/viewpost.jsp?thread=95863

In my own experience, writing to the Python/C API directly is not horrible, but there are lots of pits to fall down if you're not careful. GvR wrote a lot of that API, so I'm not sure I agree with the full strength of his statement :wink:

However, SWIG rubs me the wrong way, including its history of needing a specific version to get a specific result. (I don't know if they're still breaking backward compatibility at the same rate that they once were.)

I would hate to move away from CXX because it is less popular (unless the lower popularity translates into a real loss of utility, like updates following Python releases). There are a lot of things in this world that are less popular but still superior -- think of beers for instance :wink: I would just advocate looking at all the alternatives to SWIG, including raw Python/C API.

(Again, I'm late to the party, so forgive me if my assumptions are wrong or these things have already been worked through.)

Cheers,
Mike

I don't mind this going on the list with the warning that I was not trying to start any sort of anti-SWIG movement -- these are just my first impressions, and at the end of the day I may feel differently. I think you're *dead on* that personal experience and preferences have a lot to do with how these options are evaluated. Although, ideally, it would be nice to find some common ground so we aren't all just hacking on our own things, obviously :wink: And it would be nice to find a single tool that is suitable for both C and C++. And it would be nice if it made my dinner :wink:

Eric Firing wrote:

2) Given my discomfort with C++, I am even more uncomfortable relying on what seems to be an almost unmaintained component. All the warnings were worrying me, for example. Your recent fixups have reduced that source of discomfort!

My recent fixup was exactly 6 characters -- glad to see they had so much impact! :wink:

I think the small amount of support is a valid concern, particularly as we move toward Py3k, if no one steps up to the plate to help with the migration.

I appreciate the way it fits so nicely into C++ ideas about RAII and exceptions -- but if you're not a C++ guy, that pro is probably a con.

Swig: I agree with you on that one. I have never taken to swig either, although I recognize that it has many fans. I'm afraid that it may be the only automated game in town for C++ apart from PyCxx, though.

Boost.Python is another option -- probably more like SWIG than pycxx in that the user specifies *how* something should be called rather than the steps to do it. But if you're C++-averse, that's a huge step in the wrong direction -- it's pretty simple to use when things go well, but when things go wrong, the error messages can be completely inscrutable.

ctypes is also an option for calling C code, though I have no experience with it.

Pyrex: now, there is something I like! Not so great for C++, but it surely is nice for wrapping C, and for speeding some things up while writing "almost Python".

Pyrex is "fun to use" and it allows code to be more gradually migrated from Python to C. We would need to take care to not add another run-time dependency for users. (As an aside, I can't find the license of Pyrex. Does anyone know what it is?)

Native API: I've done a little of that--the cntr.c interface code. The good things are that it eliminates another layer of dependency, and what you see is what you get. It is pretty tedious, though, and there are lots of gotchas. For me it is a reasonable choice for C code, although I would still gravitate towards Pyrex in most cases. For C++, I would not know where to start. Again, that's my personal handicap.

Agreed, it can be tricky to mix C++ exceptions with raw Python/C API wrappers. It's doable by following certain patterns, but it'd be nice to automate those somehow.

When I worked on Gamera, there was nothing terribly suitable for the sort of generic programming C++ stuff we had (the kind of C++ that makes you cringe, I'm sure), so we ended up writing our own wrapper mechanism that was very highly specialized. I don't think that approach is necessary or desirable here.

I suppose my initial disappointment in SWIG is that I like its fundamental idea -- of automating the tedious boilerplate -- but the execution of it just seems so heavyweight. But maybe that doesn't really matter. This is Python after all :wink:

Cheers,
Mike

Eric Firing wrote:

Swig: GvR's comments are interesting.

Do you have a pointer to those? I'd love to see them.

My thoughts on SWIG:

Its real strength is that, being automated, it can be used to wrap large libraries, particularly ones that are constantly evolving (wxPython). Another one is that it is used for a lot of projects, so it's a handy tool to know. I'd never use it to wrap code written just to extend python.

By the way, it looks like there is now a code generator for boost:

http://www.language-binding.net/pyplusplus/pyplusplus.html

so that's another option -- but talk about dependencies!

And there's SIP, but I've never heard of it being used by anything other than PyQt.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker@...236...

Michael Droettboom wrote:

Pyrex is "fun to use" and it allows code to be more gradually migrated from Python to C. We would need to take care to not add another run-time dependency for users.

There are no run-time dependencies with Pyrex -- it generates C code, which is then compiled.

(As an aside, I can't find the license of Pyrex. Does anyone know what it is?)

No, but it doesn't matter as it won't get included in the project, only its output would -- so it's like a compiler, not a library. Unless we want to write our own fork of it, which I doubt!

-Chris

Michael Droettboom wrote:
[...]

I think the small amount of support is a valid concern, particularly as we move toward Py3k, if no one steps up to the plate to help with the migration.

Do you have a sense of how difficult that migration would be?

I appreciate the way it fits so nicely into C++ ideas about RAII and exceptions -- but if you're not a C++ guy, that pro is probably a con.

What is RAII?
Actually, to the limited extent that I understand it, I *like* what I take to be the pycxx approach, transparently translating C++ constructs into corresponding Python constructs (e.g., exceptions). And I appreciate some of the ways that C++ advances beyond C. I just get all tangled up in the complicated declarations and the need to track code in .cpp and .h files in parallel. C++ feels an order of magnitude more complex to me than C. My discomfort level is very gradually decreasing, though.
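
As an illustration of that kind of translation, here is a minimal sketch of catching C++ exceptions at a C-callable boundary and recording them as Python-style errors. It is not PyCXX's code: `set_error`, `g_error_type`, `g_error_message`, `parse_positive`, and `wrapped_parse_positive` are hypothetical names, and the two string globals only stand in for the interpreter's error indicator (a real wrapper would call something like `PyErr_SetString` and return NULL).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the interpreter's error indicator.
// A real extension would call PyErr_SetString(PyExc_ValueError, msg) instead.
static std::string g_error_type;
static std::string g_error_message;

void set_error(const char* type, const char* message) {
    g_error_type = type;
    g_error_message = message;
}

// Ordinary C++ code that reports failure by throwing.
long parse_positive(long n) {
    if (n <= 0) {
        throw std::out_of_range("n must be positive");
    }
    return n;
}

// Boundary function: every C++ exception is caught here and turned into
// an error code plus a recorded error, so nothing propagates into C code.
extern "C" int wrapped_parse_positive(long n, long* result) {
    assert(result != nullptr);
    try {
        *result = parse_positive(n);
        return 0;  // success
    } catch (const std::out_of_range& e) {
        set_error("ValueError", e.what());
    } catch (const std::exception& e) {
        set_error("RuntimeError", e.what());
    } catch (...) {
        set_error("RuntimeError", "unknown C++ exception");
    }
    return -1;  // failure; details are in the error indicator
}
```

Calling `wrapped_parse_positive(-1, &out)` returns -1 and leaves "ValueError" in the stand-in indicator, while a positive argument returns 0 and fills `out`. Writing this try/catch by hand in every wrapper is the tedium that a library layer can hide.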

Pyrex is "fun to use" and it allows code to be more gradually migrated from Python to C. We would need to take care to not add another run-time dependency for users. (As an aside, I can't find the license of Pyrex. Does anyone know what it is?)

Not a problem, as Chris has noted in another reply. Pyrex is already being used in Jeff Whitaker's basemap module, which is a critical contribution to mpl for many users although it is not presently in the main tree.

I don't know how well pyrex will be maintained and updated, long-term.

[...]

I suppose my initial disappointment in SWIG is that I like its fundamental idea -- of automating the tedious boilerplate -- but the execution of it just seems so heavyweight. But maybe that doesn't really matter. This is Python after all :wink:

I think it is a perfectly valid concern and consideration.

Eric

Christopher Barker wrote:

Eric Firing wrote:

Swig: GvR's comments are interesting.

Do you have a pointer to those? I'd love to see them.

http://www.artima.com/weblogs/viewpost.jsp?thread=95863

(It was in Mike's original message.)

Eric

http://www.artima.com/weblogs/viewpost.jsp?thread=95863

Thanks

(It was in Mike's original message.)

I must have missed that. Guido says:

"""
I've yet to see an extension module using SWIG that doesn't make me think it was a mistake to use SWIG instead of manually written wrappers. The extra time paid upfront to create hand-crafted wrappers is gained back hundredfold by time saved debugging the SWIG-generated code later.
"""

hmm. wxPython is my prime example. I can't imagine that ever being done enough to be useful without auto code generation. Period. Of course, that doesn't apply to far smaller libraries.

And many of the hand-written wrappers I've seen are nightmares of incorrect reference counting. I think hand-wrapping is just a plain bad idea when you have ctypes and pyrex and Boost (and CXX?) as options instead.

-Chris

Eric Firing wrote:

Michael Droettboom wrote:

I appreciate the way it fits so nicely into C++ ideas about RAII and exceptions -- but if you're not a C++ guy, that pro is probably a con.

What is RAII?

"Resource Acquisition Is Initialization" -- It's a C++ resource management technique where all resources are acquired in constructors and released in destructors (I'm grossly oversimplifying). It allows memory management to be mostly hidden from the users of classes, and exceptions to work as they were intended (without lots of try/catch blocks everywhere). RAII is not the solution to all memory management problems, of course, but it's a pretty common and important rule of thumb for C++. See here for more info:

http://www.parashift.com/c++-faq-lite/big-picture.html#faq-6.18

PyCxx uses RAII to manage the lifetime of Python objects without requiring explicit reference counting. For instance, when you get a Py::Int from an argument, the reference count of its "owned" Python object is increased in the constructor and decreased in the destructor. So when the Py::Int goes out of scope, it automatically destroys its reference to the underlying Python object.

In this way, I see pycxx less as a wrapper mechanism (like SWIG or even Boost), and more like C++ convenience and safety extensions to the regular Python/C API. Given my familiarity with the Python/C API, that could be why I like it.
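
To make the constructor/destructor idea concrete, here is a minimal sketch of an RAII handle over a reference-counted object. It is a toy, not PyCXX's implementation: `ToyObject`, `toy_incref`, `toy_decref`, `ToyRef`, and `checked_double` are hypothetical names, and `ToyObject` merely stands in for a Python object so the example compiles without Python headers.

```cpp
#include <cassert>
#include <utility>

// Toy stand-in for a reference-counted Python object (hypothetical).
struct ToyObject {
    int refcount = 1;  // the creator holds one reference
    long value = 0;
};

void toy_incref(ToyObject* o) { ++o->refcount; }

void toy_decref(ToyObject* o) {
    assert(o->refcount > 0);
    --o->refcount;  // a real runtime would free the object at zero
}

// RAII handle: takes a reference in the constructor and gives it back
// in the destructor, so callers never count references by hand.
class ToyRef {
public:
    explicit ToyRef(ToyObject* o) : obj_(o) { toy_incref(obj_); }
    ToyRef(const ToyRef& other) : obj_(other.obj_) { toy_incref(obj_); }
    ToyRef& operator=(ToyRef other) {  // copy-and-swap
        std::swap(obj_, other.obj_);
        return *this;
    }
    ~ToyRef() { toy_decref(obj_); }

    long value() const { return obj_->value; }

private:
    ToyObject* obj_;
};

// The early return needs no cleanup code: the destructor of `ref`
// runs on every exit path, including when an exception is thrown.
long checked_double(ToyObject* arg) {
    ToyRef ref(arg);
    if (ref.value() < 0) {
        return -1;
    }
    return 2 * ref.value();
}
```

After any call to `checked_double`, the object's reference count is back to what it was before the call, whichever return path was taken. With hand-written incref/decref pairs, the early return is exactly the spot where a decrement tends to get forgotten.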

I don't know how well pyrex will be maintained and updated, long-term.

Which, of course, is one of the impetuses (impeti?) for moving away from CXX. We'll have to tread carefully.

I suppose my initial disappointment in SWIG is that I like its fundamental idea -- of automating the tedious boilerplate -- but the execution of it just seems so heavyweight. But maybe that doesn't really matter. This is Python after all :wink:

I think it is a perfectly valid concern and consideration.

Some benchmarking may be in order. I think it would be useful to know the difference in function call overhead between the different approaches, for instance. And overall memory usage is probably a secondary concern.

Cheers,
Mike

Christopher Barker wrote:

http://www.artima.com/weblogs/viewpost.jsp?thread=95863

Thanks

(It was in Mike's original message.)

I must have missed that. Guido says:

"""
I've yet to see an extension module using SWIG that doesn't make me think it was a mistake to use SWIG instead of manually written wrappers. The extra time paid upfront to create hand-crafted wrappers is gained back hundredfold by time saved debugging the SWIG-generated code later.
"""

hmm. wxPython is my prime example. I can't imagine that ever being done enough to be useful without auto code generation. Period. Of course, that doesn't apply to far smaller libraries.

I agree with you that some form of automation is absolutely necessary for these large libraries. However, wxPython also proves, IMHO, that using SWIG specifically is no walk in the park. wxPython has put a lot of effort into patching bugs in SWIG (which never seem to make it into upstream SWIG), and writing complicated extensions and workarounds. It is not a silver bullet for reference counting -- wxPython has had reference counting bugs as recently as 2.8.3. In fact, since SWIG typemaps are required so often (which is just raw Python/C API code anyway), there are still opportunities to make reference counting bugs, and they are harder to track down since you're one step removed from the real code.

And many of the hand-written wrappers I've seen are nightmares of incorrect reference counting. I think hand-wrapping is just a plain bad idea when you have ctypes and pyrex and Boost (and CXX?) as options instead.

I'll have to respectfully disagree with your assessment of hand-wrapping. It is the least likely to become unsupported of any of these options, and the Python/C API has been reasonably stable over a number of revisions (certainly since new-style classes were introduced, at least), and is extremely well-documented.

Cheers,
Mike