Re: Issues with C++ exception handling in an FDW

Lists: pgsql-hackers
From: "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Issues with C++ exception handling in an FDW
Date: 2012-01-30 23:04:02
Message-ID: 4D00A61DE9C15F4E9A8644D52DFF9A0537907F39@G4W3296.americas.hpqcorp.net

Hello,

I've run into a very odd issue calling C++ code that uses exceptions from within our PostgreSQL FDW. Specifically, we have broken our FDW into two components, a C layer that looks quite similar to the FDW for text files and a C++ layer that is called into by the C layer to interface with our storage file format.

We compile these two components into separate shared libraries, thus we have:

c-fdw.so
c++-fdw.so

and c-fdw.so is linked with -Wl,-rpath so that it can find c++-fdw.so at load time.

When there are no errors, everything works flawlessly. However, we noticed that merely throwing an exception in the C++ layer causes an immediate segmentation fault, even when the throw is enclosed in a try { } catch(...) { } block.

If anyone has seen anything like this, any pointers or suggestions would be much appreciated. I have followed all of the recommendations in the PostgreSQL documentation, with no luck. I am not overloading the _init() functions in either shared library (another potential source of errors I have read about).

Thanks!
Craig Soules


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Issues with C++ exception handling in an FDW
Date: 2012-01-31 18:44:13
Message-ID: CA+TgmoZ+FMe_EkhEGY0vq1beZYxVwQGz-bBQDm4z_UqhvRKQJQ@mail.gmail.com

On Mon, Jan 30, 2012 at 6:04 PM, Soules, Craig <craig(dot)soules(at)hp(dot)com> wrote:
> When there are no errors everything works flawlessly, however, we noticed that even throwing an exception in the C++ layer was causing an immediate segmentation fault.  Even when encapsulated in a try { } catch(...) { } block.

Stack backtrace?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Issues with C++ exception handling in an FDW
Date: 2012-01-31 18:52:52
Message-ID: CAEYLb_WCA-mgRQxGZ3nJXf0jYiFGEmBG=6SwY0QpCk08t8krew@mail.gmail.com

On 30 January 2012 23:04, Soules, Craig <craig(dot)soules(at)hp(dot)com> wrote:
> When there are no errors everything works flawlessly, however, we noticed that even throwing an exception in the C++ layer was causing an immediate segmentation fault.  Even when encapsulated in a try { } catch(...) { } block.
>
> If anyone has seen anything like this, any pointers or suggestions would be much appreciated.  I have followed all of the recommendations in the PostgreSQL documentation, with no luck.  I am not overloading the _init() functions in either shared library (another potential source of errors I have read about).

I suggest that you generalise from the example of PLV8. The basic
problem is that the effect of longjmp()ing over an area of the stack
with a C++ non-POD type is undefined. I don't think you can even use
structs, as they have implicit destructors in C++.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
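
Note on the longjmp() point in the message above: the following is a minimal sketch, not PostgreSQL code. It assumes a stand-in jump buffer in place of the backend's real error handling, and every name in it (error_context, raise_backend_error, ScanCounters, scan_step, run_guarded) is hypothetical. It shows the case that is well defined in C++11 and later: the frames skipped by longjmp() hold only objects with trivial destructors. If scan_step() held something like a std::string local instead, the same jump would have undefined behavior.

```cpp
#include <cassert>
#include <csetjmp>

namespace {

// Stand-in for the backend's error-recovery point (an assumption for
// illustration; the real backend manages its own jump buffers).
std::jmp_buf error_context;

// Stand-in for a backend function that reports an error by jumping away.
[[noreturn]] void raise_backend_error() {
    std::longjmp(error_context, 1);
}

// Plain old data: no constructors, destructors, or virtual functions.
struct ScanCounters {
    int rows_read;
    int pages_read;
};

// C++ frame that the jump passes over. It holds only trivially
// destructible locals, so skipping it leaves nothing un-destroyed.
void scan_step(bool fail) {
    ScanCounters counters = {0, 0};
    counters.rows_read += 1;
    if (fail) {
        raise_backend_error();
    }
    (void)counters;
}

}  // namespace

// Returns 0 on the normal path and 1 when the error path was taken.
int run_guarded(bool fail) {
    if (setjmp(error_context) == 0) {
        scan_step(fail);
        return 0;
    }
    return 1;
}

// Usage: exercise both paths.
void example_guarded_usage() {
    int normal = run_guarded(false);
    int jumped = run_guarded(true);
    assert(normal == 0);
    assert(jumped == 1);
    (void)normal;
    (void)jumped;
}
```

The design point is only about what lives on the stack between the setjmp() and the longjmp(); heap objects reachable from those frames are not destroyed by the jump either, so they would leak unless something else owns them.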


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
Subject: Re: Issues with C++ exception handling in an FDW
Date: 2012-01-31 19:01:05
Message-ID: 201201312001.06000.andres@anarazel.de

On Tuesday, January 31, 2012 07:52:52 PM Peter Geoghegan wrote:
> On 30 January 2012 23:04, Soules, Craig <craig(dot)soules(at)hp(dot)com> wrote:
> > When there are no errors everything works flawlessly, however, we noticed
> > that even throwing an exception in the C++ layer was causing an
> > immediate segmentation fault. Even when encapsulated in a try { }
> > catch(...) { } block.
> >
> > If anyone has seen anything like this, any pointers or suggestions would
> > be much appreciated. I have followed all of the recommendations in the
> > PostgreSQL documentation, with no luck. I am not overloading the
> > _init() functions in either shared library (another potential source of
> > errors I have read about).
>
> I suggest that you generalise from the example of PLV8. The basic
> problem is that the effect of longjmp()ing over an area of the stack
> with a C++ non-POD type is undefined. I don't think you can even use
> structs, as they have implicit destructors in C++.

The PODness of a struct depends on its contents.

Andres


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
Subject: Re: Issues with C++ exception handling in an FDW
Date: 2012-01-31 19:42:19
Message-ID: CAEYLb_Uk_TT2QO=n3PHBJP9b4buPGHJKSKZ-81GE3s+1iHhgkg@mail.gmail.com

On 31 January 2012 19:01, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> I suggest that you generalise from the example of PLV8. The basic
>> problem is that the effect of longjmp()ing over an area of the stack
>> with a C++ non-POD type is undefined. I don't think you can even use
>> structs, as they have implicit destructors in C++.
> The PODness of a struct depends on its contents.

Right. If I were going to invest much effort in this sort of thing, I
might even write a static assertion that verified a given type's
POD-ness over time, by declaring it within a union, which would
work unless you were using C++11. Or, just use std::is_pod to build
a static assertion.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
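
Note on the static assertion suggested above: a minimal sketch, assuming a C++11 or later compiler. The helper is_pod_like and both structs are hypothetical names for illustration. std::is_pod, which the message names, is deprecated as of C++20, so the sketch spells out the equivalent pair of checks (trivial and standard-layout) instead. It also illustrates the earlier remark that a struct's POD-ness depends on its contents.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true when T is both trivial and standard-layout,
// which is how C++11 defines a POD type.
template <typename T>
struct is_pod_like
    : std::integral_constant<bool,
                             std::is_trivial<T>::value &&
                                 std::is_standard_layout<T>::value> {};

// Only built-in members: this struct is POD.
struct ScanCursor {
    int fd;
    long offset;
    char name[64];
};

// A std::string member has a non-trivial destructor, so this struct is
// not POD even though it is declared with the same "struct" keyword.
struct ScanCursorWithString {
    int fd;
    std::string name;
};

// These fail at compile time if a later change alters the types'
// POD-ness.
static_assert(is_pod_like<ScanCursor>::value,
              "ScanCursor must stay POD");
static_assert(!is_pod_like<ScanCursorWithString>::value,
              "ScanCursorWithString is expected to be non-POD");
```

Placing the first kind of assertion next to each type that may sit on the stack during a call into the backend turns an accidental non-POD member into a compile error rather than a runtime surprise.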


From: "Soules, Craig" <craig(dot)soules(at)hp(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Issues with C++ exception handling in an FDW
Date: 2012-01-31 21:21:38
Message-ID: 4D00A61DE9C15F4E9A8644D52DFF9A0537908E6A@G4W3296.americas.hpqcorp.net

> I suggest that you generalise from the example of PLV8. The basic
> problem is that the effect of longjmp()ing over an area of the stack
> with a C++ non-POD type is undefined. I don't think you can even use
> structs, as they have implicit destructors in C++.

I had thought that this was only an issue if you tried to longjmp() over a section of C++ code starting from a postgres backend C function? From the PostgreSQL documentation:

"If calling backend functions from C++ code, be sure that the C++ call stack contains only plain old data structures (POD). This is necessary because backend errors generate a distant longjmp() that does not properly unroll a C++ call stack with non-POD objects."

But this is not what our code is doing. Our code is a C++ function that only does the following:

    try {
        throw 1;
    } catch (int e) {
    } catch (...) {
    }

which causes an immediate segmentation fault. To answer another responder's question, the stack trace looks as follows:

#0  0x00002b3ce8f40fa5 in __cxa_allocate_exception ()
    from /usr/lib64/libstdc++.so.6
#1  0x00002b3ce77b6256 in initMBSource (state=0x1ab87a80)
    at /data/soules/metaboxA-bugfix/Metabox/debug_build/src/lib/query/dsFdwShim.cpp:16791
#2  0x00002b3ce6c0b0aa in dsBeginForeignScan (node=0x1ab872d0,
    eflags=<value optimized out>) at dataseries_fdw.c:819
#3  0x000000000057606c in ExecInitForeignScan ()
#4  0x000000000055c715 in ExecInitNode ()
#5  0x000000000056874c in ExecInitAgg ()
#6  0x000000000055c6a5 in ExecInitNode ()
#7  0x000000000055b944 in standard_ExecutorStart ()
#8  0x0000000000621b96 in PortalStart ()
#9  0x000000000061edad in exec_simple_query ()
#10 0x000000000061f624 in PostgresMain ()
#11 0x00000000005e4c5c in ServerLoop ()
#12 0x00000000005e595c in PostmasterMain ()
#13 0x000000000058a77e in main ()

Note: #2 is the entry into our C library and #1 is the entry into our C++ library

This appears to be some kind of allocation error, but the machine on which I'm running has plenty of free RAM:

[~]$ free
                      total       used       free     shared    buffers     cached
Mem:               24682888   10505920   14176968          0    1220496    7412352
-/+ buffers/cache:             1873072   22809816
Swap:               2096472          0    2096472

I also don't understand how it could truly be an allocation issue since we new/delete plenty of memory during a successful run (as well as using plenty of C++ containers which do internal allocation).

Hopefully this helps jog thoughts on my issue!

Thanks again!
Craig
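
Note on the two-library layout described in this thread: a minimal sketch of one way a C layer can call into a C++ layer without letting exceptions cross the language boundary. It assumes nothing about the poster's actual code. The names ShimStatus, open_source, copy_message, and shim_open_source are hypothetical, and the sketch does not address the segmentation fault itself. The idea is that every extern "C" entry point catches all exceptions and hands back a status code plus a message, so that the C caller decides how to report the failure once no C++ frames remain on the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <stdexcept>

// Hypothetical status codes shared with the C layer.
enum ShimStatus { SHIM_OK = 0, SHIM_ERROR = 1 };

namespace {

// Stand-in for the C++ storage layer; it may throw.
int open_source(bool fail) {
    if (fail) {
        throw std::runtime_error("cannot open source");
    }
    return 42;  // pretend handle
}

// Copy a message into the caller's buffer; snprintf always
// NUL-terminates when the buffer length is non-zero.
void copy_message(char *buf, std::size_t len, const char *msg) {
    if (buf != nullptr && len > 0) {
        std::snprintf(buf, len, "%s", msg);
    }
}

}  // namespace

// C-callable entry point: no exception may escape from here.
extern "C" int shim_open_source(int fail, int *handle_out,
                                char *errbuf, std::size_t errlen) {
    try {
        int handle = open_source(fail != 0);
        if (handle_out != nullptr) {
            *handle_out = handle;
        }
        return SHIM_OK;
    } catch (const std::exception &e) {
        copy_message(errbuf, errlen, e.what());
    } catch (...) {
        copy_message(errbuf, errlen, "unknown C++ exception");
    }
    return SHIM_ERROR;
}

// Usage, written in C++ here for brevity; a C caller would look the same.
void example_shim_usage() {
    int handle = 0;
    char err[64] = "";
    int rc = shim_open_source(1, &handle, err, sizeof err);
    assert(rc == SHIM_ERROR);
    assert(std::strcmp(err, "cannot open source") == 0);
    (void)rc;
}
```

With this shape, the C side could check the return value and only then raise a backend error (for example with ereport()), at a point where the longjmp() it triggers passes over C frames only.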