terminate called after throwing an instance of 'Py::AttributeError'

Hi,

I have a web application using matplotlib which is unpredictably
crashing with the error message from the subject. It seems to be
happening in ft2font, but I can't be certain at this stage that it's
only occurring there (although since isolating it via logging
statements, every time it has occurred has been in that spot). The
crash occurs at load time, seemingly through a chain of import
statements (starting with wsgi app -> django -> my app):
matplotlib.colorbar -> matplotlib.lines -> matplotlib.font_manager ->
matplotlib.ft2font

Google is strangely quiet on that particular message; the closest I
have found that also involves ft2font was this rather old one:
http://comments.gmane.org/gmane.comp.python.matplotlib.devel/1332

The unpredictable nature of it suggests that it's thread-related, but
other than that I have no further clues. The intermittency also makes
testing any theory or avenue quite slow at times! Does anyone have any
suggestions, hints for further probing... anything, please?

The particulars:
Server OS: openSUSE 11.3 (x86_64)
matplotlib: 1.0.0 (compiled from source distro)
Server: apache prefork, mod_wsgi
Python version: 2.6.4

Extra factors:
There are two versions of the application, deployed in virtualenvs
(identical matplotlib versions). It does affect both of them,
although I've only been investigating with one. It frequently seems
to affect a group of processes; that is, reloading is required
multiple times before it returns to normal.

mod_wsgi is running in embedded mode, but the same problem was
occurring with mod_python -- in fact, that was my main impetus for
porting to wsgi. However, the same application ran fine on the previous
server (SUSE Linux Enterprise Server 11 (x86_64)), with 3 versions of
the application, using mod_python. It was previously using matplotlib
0.98.5.2; according to my commit message, the upgrade was prompted by
the server move and that version not compiling against libpng1.4 on
the new server.

Thanks, Mark.

--
Where the hell is Mark:
http://blog.everythingtastesbetterwithchilli.com/

Can you provide a stack trace -- either a Python one, or a gdb one?

Mike

On 05/18/2011 03:25 AM, Mark Hepburn wrote:

Hi,

I have a web application using matplotlib which is unpredictably
crashing with the error message from the subject. [...]

[Oops... just realised I have been replying to Michael, not the list.
Very sorry.]

I had a closer look at the trace and the ft2font code, and I'm still
none the wiser. It's clear what's happening, I just have no idea why.
I have tried to trigger it in a test-case by calling imp.load_dynamic
on ft2font repeatedly, and using both threads and processes (which I
didn't think would work, but I was clutching at straws) as well, but
still no joy.
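
For anyone wanting to repeat this kind of experiment, here is a minimal sketch of a slightly different approach from the imp.load_dynamic loop described above: it starts many fresh interpreters concurrently, has each one import the module, and collects the exit codes. The names import_once and stress_import are made up for illustration, "json" in the usage line is only a stand-in for matplotlib.ft2font, and the code assumes Python 3 (subprocess.run, concurrent.futures) rather than the 2.6.4 used on the server. There is no guarantee it reproduces a crash that may depend on the Apache environment.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def import_once(module_name):
    """Import module_name in a fresh interpreter and return its exit code."""
    proc = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


def stress_import(module_name, runs=50, workers=8):
    """Run many concurrent fresh-interpreter imports; return the non-zero exit codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(import_once, [module_name] * runs))
    # On POSIX a negative code means the child was killed by a signal,
    # e.g. -6 for SIGABRT on Linux, which is what std::terminate raises.
    return [code for code in codes if code != 0]


if __name__ == "__main__":
    # "json" is a placeholder; substitute the extension module under suspicion.
    print(stress_import("json", runs=20, workers=4))
```

An empty list means every import exited cleanly; any entries are the exit codes of the runs that failed.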

I had a brief look at the python source too (importdl.c and import.c),
which does cache the module objects but admittedly doesn't do locking
that I can see.

This has all been on the assumption that it is indeed a race
condition of some sort, and even if that is the case I'm _also_ unsure
where it could arise from. We are now running on a 24-core machine
(up from 8 cores on the previous server, which had no problems), but my
understanding of what memory is shared where in which
apache/mod_python/wsgi configuration is too fuzzy to make sense of
that possibility. (Also, recall that the stack trace was captured
from apache in single-threaded debug mode!)

My current plan to fix it is to push the offending imports from module
top-level down into the functions where they are required, but even
assuming that is successful I would dearly love closure on this!
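
The workaround described above (moving an import from module top level into the function that needs it) can be sketched roughly as follows. This is only an illustration: lazy_module and render_chart are hypothetical names, and the standard-library "json" module stands in for matplotlib.colorbar so that the snippet runs without matplotlib installed. It defers the import until first use; it does not by itself explain or fix the underlying crash.

```python
import importlib


def lazy_module(name):
    """Import `name` on first use rather than at module load time."""
    # After the first successful import, import_module returns the cached
    # entry from sys.modules, so repeated calls are cheap.
    return importlib.import_module(name)


def render_chart(values, backend="json"):
    """Hypothetical view function; `backend` stands in for matplotlib.colorbar."""
    # The import happens here, when a request actually needs the module,
    # not while the web application itself is being loaded.
    mod = lazy_module(backend)
    return mod.dumps(values)


if __name__ == "__main__":
    print(render_chart([1, 2]))
```

With this layout, loading the application module no longer triggers the import chain; the first request that calls the function does.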

/Mark.

On 19 May 2011 12:24, Mark Hepburn <mark.hepburn@...287...> wrote:

I spoke too soon, I hit one! (I am unreasonably excited by this at this
stage). It looks like it's the same issue; it's in FT2Image and arises from
check_unique_method_name -- I'm about to look through the source, but it
seems a likely candidate.
The output of both bt and bt full is attached.
Thanks once again.
/Mark.

On 19 May 2011 12:15, Mark Hepburn <mark.hepburn@...287...> wrote:

Hi, thanks for the reply.
I haven't managed to extract one yet; any hints? I've tried a few times
with "gdb httpd" -> run -X, but unsuccessfully so far. My understanding is
that this runs apache in single-threaded mode, and if it is a threading
problem it is unlikely to reproduce the problem (I think). (The other
complicating factor is that this is the only server it has been a problem
on... which is also the production server, so I've been loath to slow it
down too much like this. Biting the bullet now, though.)
There's no stack trace in the apache error log either; in fact there's not
even a time-stamp when it crashes, just the message from the subject.
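
For readers hitting the same "no trace in the log" situation on a newer interpreter, one option is the standard-library faulthandler module, sketched below. Note the assumptions: faulthandler only exists in Python 3.3 and later, so it is not part of the 2.6.4 standard library used here; enable_crash_tracebacks and the log path are hypothetical; and it prints the Python-level traceback only, not the C++ frames that gdb's bt shows.

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Write Python tracebacks to log_path if the process gets a fatal signal."""
    log_file = open(log_path, "a")
    # faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    # and SIGILL. An uncaught C++ exception ends in std::terminate, which
    # calls abort() and so raises SIGABRT.
    faulthandler.enable(file=log_file, all_threads=True)
    # The caller must keep this object alive: faulthandler writes to the
    # underlying file descriptor, which must stay open until disable().
    return log_file
```

Calling this once near the top of the WSGI entry point, before the heavy imports, would be the natural place to try it.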
Thanks again, Mark.

On 19 May 2011 00:55, Michael Droettboom <mdroe@...86...> wrote:

Can you provide a stack trace -- either a Python one, or a gdb one?

Mike
