Re: pgsql_fdw, FDW for PostgreSQL server

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2012-02-17 05:08:42
Message-ID: 4F3DE0DA.8030709@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2012/02/17 2:02), Kohei KaiGai wrote:
> I found a strange behavior with v10. Is it available to reproduce?
<snip>
> I tried to raise an error on remote side.
>
> postgres=# select * FROM ftbl WHERE 100 / (a - 3)> 0;
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q

I could reproduce the error by omitting CFLAGS=-O0 from configure
option. I usually this for coding environment so that gdb debugging
works correctly, so I haven't noticed this issue. I should test
optimized environment too...

Expected result in that case is:

postgres=# select * from pgbench_accounts where 100 / (aid - 3) > 0;
ERROR: could not fetch rows from foreign server
DETAIL: ERROR: division by zero

HINT: FETCH 10000 FROM pgsql_fdw_cursor_0
postgres=#

> This is the PG_CATCH block at execute_query(). fetch_result() raises
> an error, then it shall be catched to release PGresult.
> Although "res" should be NULL at this point, PQclear was called with
> a non-zero value according to the call trace.
>
> More strangely, I tried to inject elog(INFO, ...) to show the value of "res"
> at this point. Then, it become unavailable to reproduce when I tried to
> show the pointer of "res" with elog(INFO, "&res = %p",&res);
>
> Why the "res" has a non-zero value, even though it was cleared prior
> to fetch_result() and an error was raised within this function?

I've found the the problem is uninitialized PGresult variables.
Uninitialized PGresult pointer is used in some places, so its value is
garbage in PG_CATCH block when assignment code has been interrupted by
longjmp.

Probably recommended style would be like this:

<pseudo_code>
PGresult *res = NULL; /* must be NULL in PG_CATCH */

PG_TRY();
{
res = func_might_throw_exception();
if (PQstatus(res) != PGRES_xxx_OK)
{
/* error handling, pass message to caller */
ereport(ERROR, ...);
}

/* success case, use result of query and release it */
...
PQclear(res);
}
PG_CATCH();
{
PQclear(res);
PG_RE_THROW();
/* caller should catch this exception. */
}
</pseudo_code>

I misunderstood that PGresult pointer always has valid value after that
line, because I had wrote assignment of PGresult pointer before PG_TRY
block. Fixes for this issue are:

(1) Initialize PGresult pointer with NULL, if it is used in PG_CATCH.
(2) Move PGresult assignment into PG_TRY block so that we can get
compiler warning of uninitialized variable, just in case.

Please find attached a patch including fixes for this issue.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
pgsql_fdw_v11.patch text/plain 108.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2012-02-17 05:27:18 Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)
Previous Message Robert Haas 2012-02-17 05:02:27 Re: Command Triggers