Lists: | pgsql-hackers |
---|
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | exec_execute_message crush |
Date: | 2009-12-29 01:06:08 |
Message-ID: | 20091229.100608.37592217.t-ishii@sraoss.co.jp |
While investigating a complaint from a pgpool user, I found that
PostgreSQL crashes with the following stack trace:
#0 0x0826436a in list_length (l=0xaabe4e28)
at ../../../src/include/nodes/pg_list.h:94
#1 0x08262168 in IsTransactionStmtList (parseTrees=0xaabe4e28)
at postgres.c:2429
#2 0x0826132e in exec_execute_message (portal_name=0x857bab0 "", max_rows=0)
at postgres.c:1824
#3 0x08263b2a in PostgresMain (argc=4, argv=0x84f6c28,
username=0x84f6b08 "t-ishii") at postgres.c:3671
#4 0x0823299e in BackendRun (port=0x8511e68) at postmaster.c:3449
#5 0x08231f78 in BackendStartup (port=0x8511e68) at postmaster.c:3063
#6 0x0822f90a in ServerLoop () at postmaster.c:1387
#7 0x0822f131 in PostmasterMain (argc=3, argv=0x84f4bf8) at postmaster.c:1040
#8 0x081c6217 in main (argc=3, argv=0x84f4bf8) at main.c:188
This happens with the following sequence of extended query protocol messages:
parse
bind
describe
execute
<completes normally>
parse (invalid SQL, so the transaction is aborted)
bind (error)
describe (error)
execute (crash)
exec_execute_message crashes here:
/* Does the portal contain a transaction command? */
is_xact_command = IsTransactionStmtList(portal->stmts);
Looking into portal:
$5 = {name = 0x85727bc "", prepStmtName = 0x0, heap = 0x8596798, resowner = 0x0,
cleanup = 0, createSubid = 1,
sourceText = 0x859ac78 " SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, ct.relname AS TABLE_NAME, a.attname AS COLUMN_NAME, a.attnum AS KEY_SEQ, ci.relname AS PK_NAME FROM pg_catalog.pg_namespace n, pg_catalog.pg_c"...,
commandTag = 0x84682aa "SELECT", stmts = 0xaabe4e28, cplan = 0x0,
portalParams = 0x0, strategy = PORTAL_ONE_SELECT, cursorOptions = 4,
status = PORTAL_READY, queryDesc = 0x0, tupDesc = 0x85db060,
formats = 0x859b0c8, holdStore = 0x0, holdContext = 0x0, atStart = 1 '\001',
atEnd = 1 '\001', posOverflow = 0 '\0', portalPos = 0,
creation_time = 315313855337710, visible = 1 '\001'}
The problem is that stmts points to an invalid memory address:
(gdb) p *portal->stmts
Cannot access memory at address 0xaabe4e28
It seems the source of the problem is that exec_execute_message tries to
execute the unnamed portal, whose unnamed statement is already
gone.
Note that without pgpool the backend does not crash. I think this is
because the JDBC driver does not send execute if the prior parse, bind,
etc. failed.
The crash happens on PostgreSQL 8.3.8, 8.3.9 and 8.4.2.
Any thoughts?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 01:36:37 |
Message-ID: | 5190.1262050597@sss.pgh.pa.us |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> It seems the source of the problem is, exec_execute_message tries to
> execute unnamed portal which has unnamed statement which has already
> gone.
Could we see an actual test case?
regards, tom lane
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 01:52:30 |
Message-ID: | 20091229.105230.96920423.t-ishii@sraoss.co.jp |
> Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> > It seems the source of the problem is, exec_execute_message tries to
> > execute unnamed portal which has unnamed statement which has already
> > gone.
>
> Could we see an actual test case?
If you don't mind using pgpool, that is possible. If not, I need
to write a small program that speaks the frontend/backend protocol
directly. Which would you prefer?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 02:30:04 |
Message-ID: | 5747.1262053804@sss.pgh.pa.us |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>> Could we see an actual test case?
> If you don't mind to use pgpool, it would be possible. If not, I need
> to write a small program which handles frontend/backend protocol
> directly. What shall I do?
Hm, can't you get libpq to do it?
regards, tom lane
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 02:35:52 |
Message-ID: | 20091229.113552.48667790.t-ishii@sraoss.co.jp |
> > If you don't mind to use pgpool, it would be possible. If not, I need
> > to write a small program which handles frontend/backend protocol
> > directly. What shall I do?
>
> Hm, can't you get libpq to do it?
That depends on how "intelligent" libpq is :-) Let me try...
Another idea is a "packet recorder", which could record packets sent from
pgpool to PostgreSQL and replay them. I can't name one at the moment, but
I vaguely recall that something like that exists.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 03:47:55 |
Message-ID: | 20091229.124755.51301287.t-ishii@sraoss.co.jp |
> > Hm, can't you get libpq to do it?
>
> That depends on how libpq is "intelligent":-) Let me try...
>
> Another idea is a "packet recorder", which could record packets from
> pgpool to PostgreSQL and replay them. I don't remember at present, but
> I vaguely recall something like that exists.
It seems we can't get libpq to do it: libpq does not provide a
function that sends bind alone. As I understand it,
PQexecPrepared does bind + execute.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 04:02:20 |
Message-ID: | 9290.1262059340@sss.pgh.pa.us |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
>>> Hm, can't you get libpq to do it?
> It seems we can't get libpq to do it. libpq does not provide a
> function which can execute bind alone. In my understanding
> PQexecPrepared does bind + execute.
The event sequence you mentioned had bind followed by execute, so
I'm not seeing the problem.
(In any case, some kind of quick lobotomy in libpq would be easier
than writing a standalone test program, no?)
regards, tom lane
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 04:07:43 |
Message-ID: | 20091229.130743.00076757.t-ishii@sraoss.co.jp |
> (In any case, some kind of quick lobotomy in libpq would be easier
> than writing a standalone test program, no?)
Sounds like a nice idea.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Kris Jurka <books(at)ejurka(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 05:49:19 |
Message-ID: | alpine.BSO.2.00.0912290043330.25395@leary.csoft.net |
On Tue, 29 Dec 2009, Tatsuo Ishii wrote:
> parse
> bind
> describe
> execute
> <normally done>
> parse invalid SQL thus abort a transaction
> bind (error)
> describe (error)
> execute (crash)
>
> Please note that without pgpool backend does not crash. This is
> because JDBC driver does not do execute() if prior parse, bind
> etc. failed, I think.
The JDBC driver will fire away parse, bind, and execute all at once before
a sync, to avoid network roundtrips, so your assumption of what's going on
here without pgpool doesn't seem accurate. Attached is a test case that
tries to duplicate what you've described and it errors out normally.
Below is the JDBC driver's protocol level logging.
21:41:39.407 (1) FE=> Parse(stmt=S_1,query="BEGIN",oids={})
21:41:39.407 (1) FE=> Bind(stmt=S_1,portal=null)
21:41:39.407 (1) FE=> Execute(portal=null,limit=0)
21:41:39.408 (1) FE=> Parse(stmt=null,query="SELECT $1 ",oids={23})
21:41:39.408 (1) FE=> Bind(stmt=null,portal=null,$1=<'1'>)
21:41:39.408 (1) FE=> Describe(portal=null)
21:41:39.408 (1) FE=> Execute(portal=null,limit=0)
21:41:39.408 (1) FE=> Parse(stmt=null,query=" SELECT SELECT $1
",oids={23})
21:41:39.408 (1) FE=> Bind(stmt=null,portal=null,$1=<'2'>)
21:41:39.409 (1) FE=> Describe(portal=null)
21:41:39.409 (1) FE=> Execute(portal=null,limit=0)
21:41:39.409 (1) FE=> Sync
21:41:39.443 (1) <=BE ParseComplete [S_1]
21:41:39.443 (1) <=BE BindComplete [null]
21:41:39.443 (1) <=BE CommandStatus(BEGIN)
21:41:39.443 (1) <=BE ParseComplete [null]
21:41:39.443 (1) <=BE BindComplete [null]
21:41:39.444 (1) <=BE RowDescription(1)
21:41:39.444 (1) <=BE DataRow
21:41:39.444 (1) <=BE CommandStatus(SELECT)
21:41:39.454 (1) <=BE ErrorMessage(ERROR: syntax error at or near
"SELECT"
Position: 9)
So this shows everything working as expected. Perhaps enabling this
logging on your JDBC client would show more clearly what it is trying to
do.
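A minimal sketch of why the pipelined sequence in the log above errors out cleanly: in the extended query protocol, once a message fails the backend discards everything up to the next Sync. The types and the function below are hypothetical illustrations, not PostgreSQL code, and the `fails` flag is an assumption standing in for "processing this message raises an error".

```c
#include <assert.h>
#include <stddef.h>

/* Simplified message kinds from the extended query protocol. */
typedef enum { MSG_PARSE, MSG_BIND, MSG_DESCRIBE, MSG_EXECUTE, MSG_SYNC } MsgKind;

typedef struct
{
    MsgKind kind;
    int     fails;      /* hypothetical flag: nonzero if handling this message errors out */
} Msg;

/*
 * Toy model of error recovery: after an error, every message is discarded
 * until the next Sync. Returns how many messages were actually handled
 * (Sync itself counts as handled).
 */
static int
count_handled(const Msg *msgs, size_t n)
{
    int     handled = 0;
    int     skipping = 0;
    size_t  i;

    for (i = 0; i < n; i++)
    {
        if (msgs[i].kind == MSG_SYNC)
        {
            skipping = 0;       /* Sync ends the skip-until-Sync state */
            handled++;
        }
        else if (!skipping)
        {
            handled++;
            if (msgs[i].fails)
                skipping = 1;   /* discard everything up to the next Sync */
        }
    }
    return handled;
}
```

In this model, the Bind, Describe and Execute that follow the failing Parse in the pipelined case are never handled. If a client instead sends a Sync after each message, the later Bind, Describe and Execute do reach the message handlers, which may be relevant to the pgpool case.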
Kris Jurka
Attachment | Content-Type | Size |
---|---|---|
Crash.java | text/plain | 453 bytes |
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | books(at)ejurka(dot)com |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crush |
Date: | 2009-12-29 06:22:31 |
Message-ID: | 20091229.152231.68307810.t-ishii@sraoss.co.jp |
> > parse
> > bind
> > describe
> > execute
> > <normally done>
> > parse invalid SQL thus abort a transaction
> > bind (error)
> > describe (error)
> > execute (crash)
> >
> > Please note that without pgpool backend does not crash. This is
> > because JDBC driver does not do execute() if prior parse, bind
> > etc. failed, I think.
>
> The JDBC driver will fire away parse, bind, and execute all at once before
> a sync, to avoid network roundtrips, so your assumption of what's going on
> here without pgpool doesn't seem accurate. Attached is a test case that
> tries to duplicate what you've described and it errors out normally.
> Below is the JDBC driver's protocol level logging.
>
> 21:41:39.407 (1) FE=> Parse(stmt=S_1,query="BEGIN",oids={})
> 21:41:39.407 (1) FE=> Bind(stmt=S_1,portal=null)
> 21:41:39.407 (1) FE=> Execute(portal=null,limit=0)
> 21:41:39.408 (1) FE=> Parse(stmt=null,query="SELECT $1 ",oids={23})
> 21:41:39.408 (1) FE=> Bind(stmt=null,portal=null,$1=<'1'>)
> 21:41:39.408 (1) FE=> Describe(portal=null)
> 21:41:39.408 (1) FE=> Execute(portal=null,limit=0)
> 21:41:39.408 (1) FE=> Parse(stmt=null,query=" SELECT SELECT $1
> ",oids={23})
> 21:41:39.408 (1) FE=> Bind(stmt=null,portal=null,$1=<'2'>)
> 21:41:39.409 (1) FE=> Describe(portal=null)
> 21:41:39.409 (1) FE=> Execute(portal=null,limit=0)
> 21:41:39.409 (1) FE=> Sync
> 21:41:39.443 (1) <=BE ParseComplete [S_1]
> 21:41:39.443 (1) <=BE BindComplete [null]
> 21:41:39.443 (1) <=BE CommandStatus(BEGIN)
> 21:41:39.443 (1) <=BE ParseComplete [null]
> 21:41:39.443 (1) <=BE BindComplete [null]
> 21:41:39.444 (1) <=BE RowDescription(1)
> 21:41:39.444 (1) <=BE DataRow
> 21:41:39.444 (1) <=BE CommandStatus(SELECT)
> 21:41:39.454 (1) <=BE ErrorMessage(ERROR: syntax error at or near
> "SELECT"
> Position: 9)
>
> So this shows everything working as expected. Perhaps enabling this
> logging on your JDBC client would show more clearly what it is trying to
> do.
Thanks for the clarification. I will look more closely at the packet
exchange between pgpool and PostgreSQL.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-30 12:46:35 |
Message-ID: | 20091230.214635.09776596.t-ishii@sraoss.co.jp |
> While inspecting a complaint from a pgpool user, I found that
> PostgreSQL crashes with following stack trace:
>
> #0 0x0826436a in list_length (l=0xaabe4e28)
> at ../../../src/include/nodes/pg_list.h:94
> #1 0x08262168 in IsTransactionStmtList (parseTrees=0xaabe4e28)
> at postgres.c:2429
> #2 0x0826132e in exec_execute_message (portal_name=0x857bab0 "", max_rows=0)
> at postgres.c:1824
> #3 0x08263b2a in PostgresMain (argc=4, argv=0x84f6c28,
> username=0x84f6b08 "t-ishii") at postgres.c:3671
> #4 0x0823299e in BackendRun (port=0x8511e68) at postmaster.c:3449
> #5 0x08231f78 in BackendStartup (port=0x8511e68) at postmaster.c:3063
> #6 0x0822f90a in ServerLoop () at postmaster.c:1387
> #7 0x0822f131 in PostmasterMain (argc=3, argv=0x84f4bf8) at postmaster.c:1040
> #8 0x081c6217 in main (argc=3, argv=0x84f4bf8) at main.c:188
Ok, I think I understand what's going on.
parse
bind
describe
execute
This sequence of commands creates a cached plan in the unnamed portal.
$5 = {name = 0x8574de4 "", prepStmtName = 0x0, heap = 0x8598400,
resowner = 0x8598488, cleanup = 0x81632ca <PortalCleanup>, createSubid = 1,
sourceText = 0x85ab818 " SELECT <omitted>"...,
commandTag = 0x84682ca "SELECT", stmts = 0xaabf43b0, cplan = 0xaabf4950,
portalParams = 0x0, strategy = PORTAL_ONE_SELECT, cursorOptions = 4,
status = PORTAL_READY, queryDesc = 0x85abc20, tupDesc = 0x85ddcb0,
formats = 0x85abc68, holdStore = 0x0, holdContext = 0x0, atStart = 1 '\001',
atEnd = 1 '\001', posOverflow = 0 '\0', portalPos = 0,
creation_time = 315487957498169, visible = 1 '\001'}
The cached plan (portal->cplan) and the statements (portal->stmts) are
created by exec_bind_message():
/*
* Revalidate the cached plan; this may result in replanning. Any
* cruft will be generated in MessageContext. The plan refcount will
* be assigned to the Portal, so it will be released at portal
* destruction.
*/
cplan = RevalidateCachedPlan(psrc, false);
plan_list = cplan->stmt_list;
Note that cplan and stmts belong to the same memory context.
Then the following commands arrive:
parse (invalid SQL, so the transaction is aborted)
bind (error)
describe (error)
execute (crash)
The parse causes the transaction to abort, which leads to the call chain
AbortCurrentTransaction->AbortTransaction->AtAbort_Portals->ReleaseCachedPlan,
that is, ReleaseCachedPlan(portal->cplan). ReleaseCachedPlan calls
MemoryContextDelete(plan->context), which destroys both portal->cplan
and portal->stmts.
That is why I got a segfault when accessing portal->stmts.
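A minimal sketch of the failure mode described above, assuming a toy allocator in which a context owns every chunk allocated in it. All Toy* names are hypothetical and are not PostgreSQL APIs. The sketch only records that stmts has become dangling; it deliberately never dereferences the freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_MAX_CHUNKS 8

/* Hypothetical stand-in for a memory context: owns every chunk allocated in it. */
typedef struct ToyContext
{
    void   *chunks[TOY_MAX_CHUNKS];
    int     nchunks;
    bool    deleted;
} ToyContext;

/* Hypothetical stand-in for the portal fields discussed above. */
typedef struct ToyPortal
{
    void       *cplan;
    void       *stmts;
    bool        stmts_set;      /* true once stmts has been assigned */
    ToyContext *plan_context;   /* context that owns cplan and stmts */
} ToyPortal;

static void *
toy_alloc(ToyContext *cxt, size_t size)
{
    void   *chunk;

    if (cxt->nchunks >= TOY_MAX_CHUNKS)
        return NULL;
    chunk = malloc(size);
    if (chunk != NULL)
        cxt->chunks[cxt->nchunks++] = chunk;
    return chunk;
}

/* Deleting the context frees every chunk it owns. */
static void
toy_context_delete(ToyContext *cxt)
{
    int     i;

    for (i = 0; i < cxt->nchunks; i++)
        free(cxt->chunks[i]);
    cxt->nchunks = 0;
    cxt->deleted = true;
}

/* Bind: the plan and the statement list are allocated in the same context. */
static void
toy_bind(ToyPortal *portal, ToyContext *cxt)
{
    portal->cplan = toy_alloc(cxt, 32);
    portal->stmts = toy_alloc(cxt, 32);
    portal->stmts_set = true;
    portal->plan_context = cxt;
}

/* Abort: the plan is released and its pointer cleared; stmts is left alone. */
static void
toy_abort(ToyPortal *portal)
{
    toy_context_delete(portal->plan_context);
    portal->cplan = NULL;
}

/* True if stmts still looks set although its owning context is gone. */
static bool
toy_stmts_dangling(const ToyPortal *portal)
{
    return portal->stmts_set && portal->plan_context->deleted;
}
```

After toy_bind followed by toy_abort, cplan is NULL but stmts still holds the old address, which is the state the gdb output earlier in the thread shows.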
To fix this, I think exec_execute_message should throw an error if
portal->cleanup is NULL, since portal->cleanup is set to NULL by
AtAbort_Portals at transaction abort (or when the portal is dropped).
Here is a suggested fix:
diff -c postgres.c~ postgres.c
*** postgres.c~ 2009-06-18 19:08:08.000000000 +0900
--- postgres.c 2009-12-30 21:34:49.000000000 +0900
***************
*** 1804,1810 ****
dest = DestRemoteExecute;
portal = GetPortalByName(portal_name);
! if (!PortalIsValid(portal))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_CURSOR),
errmsg("portal \"%s\" does not exist", portal_name)));
--- 1804,1810 ----
dest = DestRemoteExecute;
portal = GetPortalByName(portal_name);
! if (!PortalIsValid(portal) || (PortalIsValid(portal) && portal->cleanup == NULL))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_CURSOR),
errmsg("portal \"%s\" does not exist", portal_name)));
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-30 12:59:25 |
Message-ID: | 4B3B4EAD.5090001@dunslane.net |
Tatsuo Ishii wrote:
> ! if (!PortalIsValid(portal) || (PortalIsValid(portal) && portal->cleanup == NULL))
>
>
Surely the second call to PortalIsValid() is redundant.
if (( !PortalIsValid(portal)) || portal->cleanup == NULL)
should do it, no?
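A small sketch of why the two conditions agree, assuming PortalIsValid() amounts to a non-NULL test (toy_portal_is_valid and ToyPortal below are hypothetical stand-ins). Because || short-circuits, the right-hand operand is only evaluated when the portal is valid, so the simplified form never dereferences a NULL portal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a portal: only the cleanup hook matters here. */
typedef struct ToyPortal
{
    void    (*cleanup) (void);
} ToyPortal;

/* Assumed to mirror PortalIsValid(): a plain non-NULL test. */
static bool
toy_portal_is_valid(const ToyPortal *portal)
{
    return portal != NULL;
}

/* Dummy hook so that a "live" portal has a non-NULL cleanup. */
static void
toy_cleanup(void)
{
}

/* Condition as written in the suggested patch. */
static bool
reject_original(const ToyPortal *portal)
{
    return !toy_portal_is_valid(portal) ||
        (toy_portal_is_valid(portal) && portal->cleanup == NULL);
}

/* Simplified condition: || short-circuits, so portal is non-NULL on the right. */
static bool
reject_simplified(const ToyPortal *portal)
{
    return !toy_portal_is_valid(portal) || portal->cleanup == NULL;
}
```

Both functions reject a NULL portal and a portal whose cleanup hook is NULL, and both accept a portal with a hook installed.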
cheers
andrew
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | andrew(at)dunslane(dot)net |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-30 14:26:00 |
Message-ID: | 20091230.232600.89610118.t-ishii@sraoss.co.jp |
> Tatsuo Ishii wrote:
> > ! if (!PortalIsValid(portal) || (PortalIsValid(portal) && portal->cleanup == NULL))
> >
> >
>
>
> Surely the second call to PortalIsValid() is redundant.
>
> if (( !PortalIsValid(portal)) || portal->cleanup == NULL)
>
> should do it, no?
Oops. You are right.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-30 16:51:10 |
Message-ID: | 27108.1262191870@sss.pgh.pa.us |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> parse causes transaction to abort, which causes call to
> AbortCurrentTransaction->AbortTransaction->AtAbort_Portals->ReleaseCachedPlan. It
> calls ReleaseCachedPlan(portal->cplan). ReleaseCachedPlan calls
> MemoryContextDelete(plan->context) which destroys both portal->cplan
> and portal->stmts.
> That was the reason why I had segfault by accessing portal->stmts.
> To fix this I think exec_execute_message should throw an error if
> portal->cleanup is NULL, since portal->cleanup is NULLed by
> AtAbort_Portals at transaction abort (or portal is dropped).
This is just a kluge, and a rather bad one I think. The real problem
here is that AtAbort_Portals destroys the portal contents and doesn't
do anything to record the fact. It should probably be putting the
portal into PORTAL_FAILED state, and what exec_execute_message ought
to be doing is checking for that. It might be a good idea to explicitly
zero out the now-dangling pointers in the Portal struct, too.
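One possible reading of this suggestion, as a toy sketch rather than the actual backend change: abort handling records the failure in the portal's status and clears the pointers it has invalidated, and execute checks the status instead of inferring failure from a side effect. All names are hypothetical, and which portal states should be marked failed on abort is left open here (the toy marks the portal unconditionally).

```c
#include <assert.h>
#include <stddef.h>

typedef enum
{
    TOY_PORTAL_READY,
    TOY_PORTAL_ACTIVE,
    TOY_PORTAL_FAILED
} ToyPortalStatus;

/* Hypothetical stand-in for the portal fields discussed above. */
typedef struct ToyPortal
{
    ToyPortalStatus status;
    const void *stmts;
    const void *cplan;
} ToyPortal;

/* Abort handling: record the failure and drop the now-dangling pointers. */
static void
toy_at_abort(ToyPortal *portal)
{
    portal->status = TOY_PORTAL_FAILED;
    portal->stmts = NULL;
    portal->cplan = NULL;
}

/* Execute: returns 0 if the portal may run, -1 if the request is rejected. */
static int
toy_execute(const ToyPortal *portal)
{
    if (portal == NULL || portal->status == TOY_PORTAL_FAILED)
        return -1;
    return 0;
}
```

With the status recorded explicitly, a later execute on the aborted portal is rejected before anything looks at stmts.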
It'd be nice to have a test case for this, hint hint ...
regards, tom lane
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-31 01:48:48 |
Message-ID: | 20091231.104848.34368524.t-ishii@sraoss.co.jp |
> This is just a kluge, and a rather bad one I think. The real problem
> here is that AtAbort_Portals destroys the portal contents and doesn't
> do anything to record the fact. It should probably be putting the
> portal into PORTAL_FAILED state, and what exec_execute_message ought
> to be doing is checking for that.
Yeah, I thought about that too. In AtAbort_Portals:
--------------------------------------------------------------------------
/*
 * Abort processing for portals.
 *
 * At this point we reset "active" status and run the cleanup hook if
 * present, but we can't release the portal's memory until the cleanup call.
 *
 * The reason we need to reset active is so that we can replace the unnamed
 * portal, else we'll fail to execute ROLLBACK when it arrives.
 */
void
AtAbort_Portals(void)
{
    HASH_SEQ_STATUS status;
    PortalHashEnt *hentry;

    hash_seq_init(&status, PortalHashTable);

    while ((hentry = (PortalHashEnt *) hash_seq_search(&status)) != NULL)
    {
        Portal portal = hentry->portal;

        if (portal->status == PORTAL_ACTIVE)
            portal->status = PORTAL_FAILED;
--------------------------------------------------------------------------
Should I change the last if clause to the following?
        if (portal->status == PORTAL_ACTIVE || portal->status == PORTAL_READY)
            portal->status = PORTAL_FAILED;
> zero out the now-dangling pointers in the Portal struct, too.
portal->cplan is already zeroed out by PortalReleaseCachedPlan. The problem
is that portal->stmts may belong to PortalContext or to some other context
(as in this particular case). So if we want to zero out portal->stmts, we
need to remember which memory context it belongs to, which means adding a new
struct member to the portal. I'm afraid this is overkill...
> It'd be nice to have a test case for this, hint hint ...
Still working on it...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-31 06:56:11 |
Message-ID: | 20091231.155611.128866384.t-ishii@sraoss.co.jp |
> > It'd be nice to have a test case for this, hint hint ...
>
> Still working on...
Done. Included are a C test program and a modified fe-exec.c.
The modification to fe-exec.c sends Sync after each of Parse, Bind
and Describe. Pgpool-II does this in order to get the current transaction
status.
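For context on why a Sync yields the transaction status: each Sync is answered with a ReadyForQuery message ('Z', an Int32 length of 5, and one status byte: 'I' for idle, 'T' for in a transaction block, 'E' for a failed transaction block). Below is a minimal decoding sketch; the function and enum names are hypothetical, and this is not how pgpool itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TXN_IDLE, TXN_IN_BLOCK, TXN_FAILED, TXN_UNKNOWN } ToyTxnStatus;

/*
 * Decode a complete ReadyForQuery message: type byte 'Z', Int32 length
 * (always 5), then one transaction status byte. Anything else is
 * reported as TXN_UNKNOWN.
 */
static ToyTxnStatus
toy_parse_ready_for_query(const unsigned char *msg, size_t len)
{
    unsigned long body_len;

    if (msg == NULL || len != 6 || msg[0] != 'Z')
        return TXN_UNKNOWN;

    /* Int32 length in network byte order; it counts itself plus the status byte. */
    body_len = ((unsigned long) msg[1] << 24) | ((unsigned long) msg[2] << 16) |
        ((unsigned long) msg[3] << 8) | (unsigned long) msg[4];
    if (body_len != 5)
        return TXN_UNKNOWN;

    switch (msg[5])
    {
        case 'I':
            return TXN_IDLE;        /* not in a transaction block */
        case 'T':
            return TXN_IN_BLOCK;    /* in a transaction block */
        case 'E':
            return TXN_FAILED;      /* in a failed transaction block */
        default:
            return TXN_UNKNOWN;
    }
}
```

In the crash sequence, the ReadyForQuery that follows the failed Parse would carry 'E', since the enclosing transaction has been aborted.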
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 1.7 KB |
unknown_filename | text/plain | 77.9 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2009-12-31 23:37:52 |
Message-ID: | 20349.1262302672@sss.pgh.pa.us |
Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> Done. Included are C test program along with modified fe-exec.c.
> The modification made to fe-exec.c is sending Sync after Parse, Bind
> and Describe. Pgpool-II does this in order to get current transaction
> status.
I tried this but didn't have any luck crashing the backend. libpq gets
tremendously confused by the extra ReadyForQuery responses, which is
unsurprising. The postmaster log shows
LOG: could not send data to client: Broken pipe
ERROR: relation "foo" does not exist at character 15
STATEMENT: SELECT * FROM foo
ERROR: unnamed prepared statement does not exist
ERROR: current transaction is aborted, commands ignored until end of transaction block
ERROR: current transaction is aborted, commands ignored until end of transaction block
STATEMENT: SELECT NULL , n.nspname, ct.relname, a.attname, a.attnum, ci.relname FROM pg_catalog.pg_namespace n, pg_catalog.pg_class ct, pg_catalog.pg_class ci, pg_catalog.pg_attribute a, pg_catalog.pg_index i WHERE ct.oid=i.indrelid AND ci.oid=i.indexrelid AND a.attrelid=ci.oid AND i.indisprimary AND ct.relname = 'mst_Ucompany_feature_setting' AND ct.relnamespace = n.oid AND n.nspname = 'foo' ORDER BY 1, 2, 3
So the "unnamed prepared statement does not exist" bit seems to be
related to what you are talking about, but it doesn't actually fail.
regards, tom lane
From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | tgl(at)sss(dot)pgh(dot)pa(dot)us |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: exec_execute_message crash |
Date: | 2010-01-03 12:00:33 |
Message-ID: | 20100103.210033.38086058.t-ishii@sraoss.co.jp |
> I tried this but didn't have any luck crashing the backend. libpq gets
> tremendously confused by the extra ReadyForQuery responses, which is
> unsurprising. The postmaster log shows
>
> LOG: could not send data to client: Broken pipe
> ERROR: relation "foo" does not exist at character 15
> STATEMENT: SELECT * FROM foo
> ERROR: unnamed prepared statement does not exist
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> ERROR: current transaction is aborted, commands ignored until end of transaction block
> STATEMENT: SELECT NULL , n.nspname, ct.relname, a.attname, a.attnum, ci.relname FROM pg_catalog.pg_namespace n, pg_catalog.pg_class ct, pg_catalog.pg_class ci, pg_catalog.pg_attribute a, pg_catalog.pg_index i WHERE ct.oid=i.indrelid AND ci.oid=i.indexrelid AND a.attrelid=ci.oid AND i.indisprimary AND ct.relname = 'mst_Ucompany_feature_setting' AND ct.relnamespace = n.oid AND n.nspname = 'foo' ORDER BY 1, 2, 3
>
> So the "unnamed prepared statement does not exist" bit seems to be
> related to what you are talking about, but it doesn't actually fail.
I added some debugging code that calls GetMemoryChunkContext to check
that portal->cplan and portal->stmts belong to the same memory context,
and they do. The memory does appear to be deleted by
MemoryContextDelete in ReleaseCachedPlan. I also defined
CLOBBER_FREED_MEMORY in aset.c so that freed memory is filled with 0x7f.
The strange thing is that portal->stmts was not clobbered with 0x7f.
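As an illustration of what such clobbering does, here is a toy sketch (hypothetical names, not the aset.c code) that overwrites a chunk's bytes with 0x7F and checks for that pattern. It works on a caller-owned buffer, because inspecting memory that has really been returned to the allocator is not something a program can do reliably. Note that clobbering changes the bytes inside the freed chunk; a pointer variable stored elsewhere that still holds the chunk's address keeps its old value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CLOBBER_BYTE 0x7F

/* Overwrite a chunk that is about to be released with the marker byte. */
static void
toy_clobber(void *chunk, size_t size)
{
    memset(chunk, TOY_CLOBBER_BYTE, size);
}

/* Return 1 if every byte of a non-empty chunk carries the marker, else 0. */
static int
toy_looks_clobbered(const void *chunk, size_t size)
{
    const unsigned char *p = chunk;
    size_t      i;

    if (size == 0)
        return 0;
    for (i = 0; i < size; i++)
    {
        if (p[i] != TOY_CLOBBER_BYTE)
            return 0;
    }
    return 1;
}
```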
It seems I have missed something here...
--
Tatsuo Ishii
SRA OSS, Inc. Japan