Re: [JDBC] Strange server error with current 8.0beta driver

Lists: pgsql-hackerspgsql-jdbc
From: "Barry Lind" <blind(at)xythos(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-jdbc(at)postgresql(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [JDBC] Strange server error with current 8.0beta driver
Date: 2004-11-24 05:15:32
Message-ID: 03E7D3E231BB7B4A915A6581D4296CC6CC9794@NSNOVPS00411.nacio.xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Tom,

Here is what you requested. (Thanks to Oliver for the good logging in
the jdbc driver).

I also have the test case (in java) down to the bare minimum that
generated the following output (that test case is attached). (Note that
if the FETCH in the test case is not executed then the backend crashes;
with the FETCH you get an error: "ERROR: unrecognized node type: 0")

Thanks,
--Barry

PostgreSQL 8.0devel JDBC3 with SSL (build 308)
Trying to establish a protocol version 3 connection to localhost:5432
FE=> StartupPacket(user=blind, database=fileswfs43,
client_encoding=UNICODE, DateStyle=ISO)
<=BE AuthenticationOk
<=BE ParameterStatus(client_encoding = UNICODE)
<=BE ParameterStatus(DateStyle = ISO, MDY)
<=BE ParameterStatus(integer_datetimes = off)
<=BE ParameterStatus(is_superuser = on)
<=BE ParameterStatus(server_encoding = UNICODE)
<=BE ParameterStatus(server_version = 8.0.0beta4)
<=BE ParameterStatus(session_authorization = blind)
<=BE BackendKeyData(pid=3348,ckey=914259969)
<=BE ReadyForQuery(I)
compatible = 8.0
loglevel = 2
prepare threshold = 1
getConnection returning
driver[className=org.postgresql.Driver,org(dot)postgresql(dot)Driver(at)a9ae05]
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er(at)1df5a8f, maxRows=0, fetchSize=0, flags=0
FE=> Parse(stmt=S_1,query="BEGIN",oids={})
FE=> Bind(stmt=S_1,portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Parse(stmt=S_2,query="DECLARE CUR CURSOR FOR SELECT 1",oids={})
FE=> Bind(stmt=S_2,portal=null)
FE=> Describe(portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Sync
<=BE ParseComplete [S_1]
<=BE BindComplete [null]
<=BE CommandStatus(BEGIN)
<=BE ParseComplete [S_2]
<=BE BindComplete [null]
<=BE NoData
<=BE CommandStatus(DECLARE CURSOR)
<=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er(at)1db699b, maxRows=0, fetchSize=0, flags=0
FE=> Parse(stmt=S_3,query="FETCH FORWARD 10 FROM CUR",oids={})
FE=> Bind(stmt=S_3,portal=null)
FE=> Describe(portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Sync
<=BE ParseComplete [S_3]
<=BE BindComplete [null]
<=BE RowDescription(1)
<=BE DataRow
<=BE CommandStatus(FETCH)
<=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er(at)12a0f6c, maxRows=0, fetchSize=0, flags=0
FE=> Parse(stmt=S_4,query="CLOSE CUR",oids={})
FE=> Bind(stmt=S_4,portal=null)
FE=> Describe(portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Sync
<=BE ParseComplete [S_4]
<=BE BindComplete [null]
<=BE NoData
<=BE CommandStatus(CLOSE CURSOR)
<=BE ReadyForQuery(T)
simple execute,
handler=org.postgresql.jdbc2.AbstractJdbc2Statement$StatementResultHandl
er(at)134a7d8, maxRows=0, fetchSize=0, flags=0
FE=> Bind(stmt=S_2,portal=null)
FE=> Describe(portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Sync
<=BE BindComplete [null]
<=BE NoData
<=BE ErrorMessage(ERROR: unrecognized node type: 0
Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
Server SQLState: XX000)
java.sql.SQLException: ERROR: unrecognized node type: 0
Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
Server SQLState: XX000
at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecu
torImpl.java:1356)
at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImp
l.java:1151)
at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:
166)
at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme
nt.java:363)
at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdb
c2Statement.java:308)
at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Stateme
nt.java:299)
at test80.main(test80.java:31)
SQLException: SQLState(XX000)
<=BE ReadyForQuery(E)
java.sql.SQLException: ERROR: unrecognized node type: 0
Location: File: clauses.c, Routine: expression_tree_mutator, Line:
3220
Server SQLState: XX000

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Tuesday, November 23, 2004 7:10 AM
To: Barry Lind
Cc: pgsql-jdbc(at)postgresql(dot)org; pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [JDBC] Strange server error with current 8.0beta
driver

"Barry Lind" <blind(at)xythos(dot)com> writes:
> OK, getting closer. The error happens if in jdbc I reuse
> PreparedStatement objects to reexecute the same set of queries
> multiple times.

The TRAP you showed looked like it might have something to do with
trying to execute code outside any transaction. But I dunno how it got
there.

> If you still need a test case, let me know, and I will continue to
> package up what I have been working on.

What I'd actually rather have is a list of the exact sequence of
messages sent to the server.

regards, tom lane

Attachment Content-Type Size
test80.java application/octet-stream 1.4 KB

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [JDBC] Strange server error with current 8.0beta driver
Date: 2004-11-24 09:21:44
Message-ID: 41A452A8.9010505@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Barry Lind wrote:

> I also have the test case (in java) down to the bare minimum that
> generated the following output (that test case is attached). (Note that
> if the FETCH in the test case is not executed then the backend crashes;
> with the FETCH you get an error: "ERROR: unrecognized node type: 0")

I narrowed this down to:

while (true) {
l_stmtDeclare.execute();
}

producing:

> FE=> Parse(stmt=S_1,query="BEGIN",oids={})
> FE=> Bind(stmt=S_1,portal=null)
> FE=> Execute(portal=null,limit=0)
> FE=> Parse(stmt=S_2,query="DECLARE CUR CURSOR FOR SELECT 1",oids={})
> FE=> Bind(stmt=S_2,portal=null)
> FE=> Describe(portal=null)
> FE=> Execute(portal=null,limit=0)
> FE=> Sync
> <=BE ParseComplete [S_1]
> <=BE BindComplete [null]
> <=BE CommandStatus(BEGIN)
> <=BE ParseComplete [S_2]
> <=BE BindComplete [null]
> <=BE NoData
> <=BE CommandStatus(DECLARE CURSOR)
> <=BE ReadyForQuery(T)
> simple execute, handler=org(dot)postgresql(dot)jdbc2(dot)AbstractJdbc2Statement$StatementResultHandler(at)7ecd2c3c, maxRows=0, fetchSize=0, flags=0
> FE=> Bind(stmt=S_2,portal=null)
> FE=> Describe(portal=null)
> FE=> Execute(portal=null,limit=0)
> FE=> Sync
> <=BE BindComplete [null]
> <=BE NoData
> <=BE ErrorMessage(ERROR: unrecognized node type: 2139062143
> Location: File: clauses.c, Routine: expression_tree_mutator, Line: 3237
> Server SQLState: XX000)

Valgrind says this is the culprit:

> ==26451== Invalid read of size 4
> ==26451== at 0x8185C86: eval_const_expressions_mutator (clauses.c:1185)
> ==26451== by 0x8185C32: eval_const_expressions (clauses.c:1152)
> ==26451== by 0x817D1A6: preprocess_expression (planner.c:415)
> ==26451== by 0x817CEBF: subquery_planner (planner.c:240)
> ==26451== by 0x817CD59: planner (planner.c:129)
> ==26451== by 0x810DF03: PerformCursorOpen (portalcmds.c:87)
> ==26451== by 0x81C1402: PortalRunUtility (pquery.c:934)
> ==26451== by 0x81C1762: PortalRunMulti (pquery.c:1001)
> ==26451== by 0x81C0D8E: PortalRun (pquery.c:617)
> ==26451== by 0x81BDDA7: exec_execute_message (postgres.c:1673)
> ==26451== by 0x81BF6E1: PostgresMain (postgres.c:3035)
> ==26451== by 0x818FC39: BackendRun (postmaster.c:2817)
> ==26451== by 0x818F642: BackendStartup (postmaster.c:2453)
> ==26451== by 0x818D989: ServerLoop (postmaster.c:1198)
> ==26451== by 0x818CDBA: PostmasterMain (postmaster.c:917)
> ==26451== by 0x81570F4: main (main.c:268)
> ==26451== Address 0x1BBBF704 is 260 bytes inside a block of size 1024 free'd
> ==26451== at 0x1B905460: free (vg_replace_malloc.c:153)
> ==26451== by 0x8245706: AllocSetDelete (aset.c:466)
> ==26451== by 0x82468B8: MemoryContextDelete (mcxt.c:193)
> ==26451== by 0x8247BCF: PortalDrop (portalmem.c:384)
> ==26451== by 0x82475B5: CreatePortal (portalmem.c:179)
> ==26451== by 0x81BD735: exec_bind_message (postgres.c:1369)
> ==26451== by 0x81BF4EF: PostgresMain (postgres.c:3023)
> ==26451== by 0x818FC39: BackendRun (postmaster.c:2817)
> ==26451== by 0x818F642: BackendStartup (postmaster.c:2453)
> ==26451== by 0x818D989: ServerLoop (postmaster.c:1198)
> ==26451== by 0x818CDBA: PostmasterMain (postmaster.c:917)
> ==26451== by 0x81570F4: main (main.c:268)

With a bit of gdb work, I think what is happening is this:

The first Execute of S_2, running in portal context, calls the planner
on the query contained in S_2's DeclareCursorStmt. The planner modifies
the query tree in the course of planning it (specifically, it modifies
parse->targetList). Memory allocated for the modified query comes from
the portal context.

The portal context is freed implicitly by the second Bind of S_2 (second
stack trace above).

The second Execute of S_2 then tries to use parse->targetList when
planning (first stack trace above), but that's now pointing to freed
memory. Boom.

Perhaps PerformCursorOpen should copy the query tree before planning, or
plan in a different memory context?

-O


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Barry Lind <blind(at)xythos(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [JDBC] Strange server error with current 8.0beta driver
Date: 2004-11-24 09:37:35
Message-ID: 41A4565F.2080804@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Oliver Jowett wrote:

> Perhaps PerformCursorOpen should copy the query tree before planning, or
> plan in a different memory context?

Patch attached. It moves query planning inside the new portal's memory
context. With this applied I can run Barry's testcase without errors,
and valgrind seems OK with it too.

-O

Attachment Content-Type Size
pgsql-declare-memory-clobber.txt text/plain 2.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Barry Lind <blind(at)xythos(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [JDBC] Strange server error with current 8.0beta driver
Date: 2004-11-28 22:23:04
Message-ID: 21575.1101680584@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-jdbc

Oliver Jowett <oliver(at)opencloud(dot)com> writes:
>> Perhaps PerformCursorOpen should copy the query tree before planning, or
>> plan in a different memory context?

> Patch attached. It moves query planning inside the new portal's memory
> context. With this applied I can run Barry's testcase without errors,
> and valgrind seems OK with it too.

I think the better solution is the first way (copy the querytree first).
The problem with the way you did it is that all the temporary structures
built by the planner will be left behind in the cursor's memory context,
and can't be reclaimed until the cursor is destroyed. In the case of a
complex query that could represent a pretty serious memory leak. It
seems better to eat the cost of copying the querytree an extra time,
especially since this way forms a patch that's easy to reverse whenever
we fix the planner to be less cavalier about scribbling on its input.

I've applied the attached patch instead (and analogously in 7.4 branch).
Would you confirm it fixes the problem you see?

regards, tom lane

*** src/backend/commands/portalcmds.c.orig Thu Sep 16 12:58:28 2004
--- src/backend/commands/portalcmds.c Sun Nov 28 17:02:22 2004
***************
*** 62,73 ****
RequireTransactionChain((void *) stmt, "DECLARE CURSOR");

/*
* The query has been through parse analysis, but not rewriting or
* planning as yet. Note that the grammar ensured we have a SELECT
* query, so we are not expecting rule rewriting to do anything
* strange.
*/
! rewritten = QueryRewrite((Query *) stmt->query);
if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query))
elog(ERROR, "unexpected rewrite result");
query = (Query *) linitial(rewritten);
--- 62,82 ----
RequireTransactionChain((void *) stmt, "DECLARE CURSOR");

/*
+ * Because the planner is not cool about not scribbling on its input,
+ * we make a preliminary copy of the source querytree. This prevents
+ * problems in the case that the DECLARE CURSOR is in a portal and is
+ * executed repeatedly. XXX the planner really shouldn't modify its
+ * input ... FIXME someday.
+ */
+ query = copyObject(stmt->query);
+
+ /*
* The query has been through parse analysis, but not rewriting or
* planning as yet. Note that the grammar ensured we have a SELECT
* query, so we are not expecting rule rewriting to do anything
* strange.
*/
! rewritten = QueryRewrite(query);
if (list_length(rewritten) != 1 || !IsA(linitial(rewritten), Query))
elog(ERROR, "unexpected rewrite result");
query = (Query *) linitial(rewritten);