Lists: | pgsql-hackers |
---|
From: | Kurt Roeckx <kurt(at)roeckx(dot)be> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Gcc 4.4 causes abort in plpython. |
Date: | 2008-12-26 17:47:50 |
Message-ID: | 20081226174750.GA26150@roeckx.be |
Hi,
I've been trying a gcc 4.4 snapshot (20081213) on buildfarm member
panda. It gets an abort during the pl-install-check part.
Here is the backtrace:
Core was generated by `postgres: build-farm pl_regression [local] SELECT '.
Program terminated with signal 6, Aborted.
[New process 3588]
#0 0x00002b41e7662ed5 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00002b41e7662ed5 in raise () from /lib/libc.so.6
#1 0x00002b41e76643f3 in abort () from /lib/libc.so.6
#2 0x00000000006a889d in ExceptionalCondition (
conditionName=<value optimized out>, errorType=<value optimized out>,
fileName=<value optimized out>, lineNumber=<value optimized out>)
at assert.c:57
#3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
at mcxt.c:507
#4 0x00000000006abe82 in CopyErrorData () at elog.c:1082
#5 0x00002b41ea61a755 in PLy_spi_execute_plan (ob=<value optimized out>,
list=<value optimized out>, limit=<value optimized out>) at plpython.c:2587
#6 0x00002b41ea61a9a6 in PLy_spi_execute (self=<value optimized out>,
args=0x2b41eae11d20) at plpython.c:2477
#7 0x00002b41ea8e5fdd in PyEval_EvalFrameEx ()
from /usr/lib/libpython2.5.so.1.0
#8 0x00002b41ea8e7385 in PyEval_EvalFrameEx ()
from /usr/lib/libpython2.5.so.1.0
#9 0x00002b41ea8e7bfd in PyEval_EvalCodeEx ()
from /usr/lib/libpython2.5.so.1.0
#10 0x00002b41ea8e7df2 in PyEval_EvalCode () from /usr/lib/libpython2.5.so.1.0
#11 0x00002b41ea61b89b in PLy_procedure_call (proc=0xc62880,
kargs=<value optimized out>, vargs=<value optimized out>) at plpython.c:962
#12 0x00002b41ea61eaae in PLy_function_handler (fcinfo=<value optimized out>,
proc=<value optimized out>) at plpython.c:790
#13 0x00002b41ea61f359 in plpython_call_handler (fcinfo=<value optimized out>)
at plpython.c:355
#14 0x000000000054f171 in ExecMakeFunctionResult (
fcache=<value optimized out>, econtext=<value optimized out>,
isNull=0xbdd3d0 "\177~\177\177\177\177\177\177", isDone=0xbdd488)
at execQual.c:1635
#15 0x000000000054a39b in ExecProject (projInfo=<value optimized out>,
isDone=<value optimized out>) at execQual.c:4922
#16 0x000000000055dfab in ExecResult (node=0xbdc7d8) at nodeResult.c:155
#17 0x0000000000549928 in ExecProcNode (node=0xbdc7d8) at execProcnode.c:338
#18 0x00000000005474c9 in standard_ExecutorRun (
queryDesc=<value optimized out>, direction=ForwardScanDirection,
count=<value optimized out>) at execMain.c:1343
#19 0x00000000005fc878 in PortalRunSelect (portal=0xbd6c58,
forward=<value optimized out>, count=0, dest=0xbd4c60) at pquery.c:942
#20 0x00000000005fdd30 in PortalRun (portal=<value optimized out>,
count=<value optimized out>, isTopLevel=<value optimized out>,
dest=<value optimized out>, altdest=<value optimized out>,
completionTag=<value optimized out>) at pquery.c:768
#21 0x00000000005f90cd in exec_simple_query (
query_string=<value optimized out>) at postgres.c:992
#22 0x00000000005fa707 in PostgresMain (argc=<value optimized out>,
argv=<value optimized out>, username=<value optimized out>)
at postgres.c:3569
#23 0x00000000005c7227 in ServerLoop () at postmaster.c:3258
#24 0x00000000005c963d in PostmasterMain (argc=3, argv=0xaf3720)
at postmaster.c:1031
#25 0x0000000000571695 in main (argc=3, argv=0xaf3710) at main.c:188
(gdb) frame 3
#3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
at mcxt.c:507
507 AssertArg(MemoryContextIsValid(context));
(gdb) p context
$1 = (MemoryContext) 0x0
I've tried looking at it, but I have no idea what could be wrong.
Note that this might be a compiler bug, and it would be nice
if someone could figure out if it's a bug in pgsql or gcc.
kurt
From: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
---|---|
To: | Kurt Roeckx <kurt(at)roeckx(dot)be> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Gcc 4.4 causes abort in plpython. |
Date: | 2008-12-29 12:25:47 |
Message-ID: | 20081229122547.GC4545@alvh.no-ip.org |
Kurt Roeckx wrote:
> #3 0x00000000006c8033 in MemoryContextAlloc (context=0x0, size=112)
> at mcxt.c:507
> #4 0x00000000006abe82 in CopyErrorData () at elog.c:1082
> #5 0x00002b41ea61a755 in PLy_spi_execute_plan (ob=<value optimized out>,
> list=<value optimized out>, limit=<value optimized out>) at plpython.c:2587
It's calling CopyErrorData with CurrentMemoryContext pointing to NULL,
which is possible because the GCC-inlined version of
MemoryContextSwitchTo does not check for a NULL argument (the
non-inlined version does; should we fix that?).
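For readers following along, here is a minimal sketch of what a checked switch could look like. It is not the PostgreSQL implementation: DemoContext, demo_current_context and demo_switch_to are hypothetical stand-ins for MemoryContext, CurrentMemoryContext and MemoryContextSwitchTo, reduced to the one point under discussion, namely rejecting a NULL context at switch time instead of failing later in an allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memory context. */
typedef struct DemoContext
{
    const char *name;
} DemoContext;

/* Hypothetical stand-in for the global "current context" pointer. */
static DemoContext *demo_current_context = NULL;

/*
 * Switch to a new context and return the previous one.
 * The assertion catches a NULL argument at the point of the switch.
 */
static inline DemoContext *
demo_switch_to(DemoContext *context)
{
    DemoContext *old = demo_current_context;

    assert(context != NULL);
    demo_current_context = context;
    return old;
}
```

With a check like this, the abort in the backtrace would presumably have been reported at the switch in the catch block rather than inside the later allocation.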
The question is why that memory context is set to NULL. The code looks
like this:
PLy_spi_execute_plan( ... )
{
    MemoryContext oldcontext;
    ...
    oldcontext = CurrentMemoryContext;
    PG_TRY();
    {
        ...
    }
    PG_CATCH();
    {
        MemoryContextSwitchTo(oldcontext);
        CopyErrorData();
        ...
    }
    PG_END_TRY();
    ...
}
This has been like this for quite a while, which I find surprising
because I got scolded for a similar coding pattern a while back. I think
I found that the variable was reverted by the longjmp call to the value
it had on entering the block. (IIRC Tom complained because his
compiler threw a "variable might be clobbered by longjmp" warning). We
at Command Prompt also had a similar case on the then-proprietary
Replicator code.
I think a simplistic solution is to declare the variable volatile.
Would you test that and report back?
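As an illustration of the suggested workaround, here is a small stand-alone sketch using plain setjmp/longjmp rather than the PostgreSQL macros (PG_TRY and PG_CATCH are built on the same mechanism). All names are hypothetical: the int pointers stand in for memory contexts and failing_call stands in for a call that raises an error. The saved pointer is declared volatile so that the compiler has to reload it from memory after the longjmp.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf catch_point;

/* Hypothetical stand-ins for two memory contexts and the current one. */
static int caller_ctx, inner_ctx;
static int *current_context = &caller_ctx;

/* Stand-in for a call that switches context and then raises an error. */
static void
failing_call(void)
{
    current_context = &inner_ctx;
    longjmp(catch_point, 1);
}

/* Returns 1 if the "catch" branch restored the caller's context. */
int
restores_caller_context(void)
{
    /* The pointer itself is volatile, so it is re-read after longjmp. */
    int *volatile oldcontext;

    current_context = &caller_ctx;
    oldcontext = current_context;

    if (setjmp(catch_point) == 0)
        failing_call();             /* "try" branch */
    else
    {
        /* "catch" branch: restore the context saved before the try */
        assert(oldcontext != NULL);
        current_context = oldcontext;
    }
    return current_context == &caller_ctx;
}
```

Note the placement of the qualifier: `int *volatile` makes the pointer variable volatile, which is what matters here, whereas `volatile int *` would only qualify the pointed-to object.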
Thanks.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
From: | Kurt Roeckx <kurt(at)roeckx(dot)be> |
---|---|
To: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Gcc 4.4 causes abort in plpython. |
Date: | 2008-12-29 14:24:16 |
Message-ID: | 20081229142416.GA10372@roeckx.be |
On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
>
> I think a simplistic solution is to declare the variable volatile.
> Would you test that and report back?
Yes, making oldcontext volatile makes the test pass.
It now fails at the ECPG-Check stage, but it seems that is a common
problem.
Kurt
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Kurt Roeckx <kurt(at)roeckx(dot)be> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Gcc 4.4 causes abort in plpython. |
Date: | 2008-12-29 16:19:56 |
Message-ID: | 8388.1230567596@sss.pgh.pa.us |
Kurt Roeckx <kurt(at)roeckx(dot)be> writes:
> On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
>> I think a simplistic solution is to declare the variable volatile.
>> Would you test that and report back?
> Yes, making oldcontext volatile makes the test pass.
This is a gcc bug and you should report it. Since the variable is
not assigned within the try-block, volatile marking should not be
necessary.
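For contrast, a sketch of the case where the C standard does require volatile: an automatic variable that is modified between the setjmp call and the longjmp. Without the qualifier its value after the jump is indeterminate; with it, the last stored value is what the code after the jump sees. The names below (env, fail, steps_completed) are hypothetical and plain setjmp/longjmp is used instead of PG_TRY.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical stand-in for raising an error. */
static void
fail(void)
{
    longjmp(env, 1);
}

/* Counts loop iterations completed before the simulated error. */
int
steps_completed(int fail_at)
{
    /* Modified after setjmp, so volatile is required here. */
    volatile int steps = 0;

    assert(fail_at >= 0);

    if (setjmp(env) == 0)
    {
        for (;;)
        {
            if (steps == fail_at)
                fail();             /* jumps back to setjmp */
            steps = steps + 1;
        }
    }
    /* Reached only after the longjmp; steps holds the last stored value. */
    return steps;
}
```

In the plpython code quoted earlier, oldcontext is assigned only before PG_TRY, so it falls outside this rule, which is the basis for calling the observed behaviour a compiler bug.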
regards, tom lane
From: | Kurt Roeckx <kurt(at)roeckx(dot)be> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Gcc 4.4 causes abort in plpython. |
Date: | 2008-12-29 17:26:34 |
Message-ID: | 20081229172634.GA26149@roeckx.be |
On Mon, Dec 29, 2008 at 11:19:56AM -0500, Tom Lane wrote:
> Kurt Roeckx <kurt(at)roeckx(dot)be> writes:
> > On Mon, Dec 29, 2008 at 09:25:47AM -0300, Alvaro Herrera wrote:
> >> I think a simplistic solution is to declare the variable volatile.
> >> Would you test that and report back?
>
> > Yes, making oldcontext volatile makes the test pass.
>
> This is a gcc bug and you should report it. Since the variable is
> not assigned within the try-block, volatile marking should not be
> necessary.
Reported as:
http://gcc.gnu.org/PR38660
kurt