Re: pgsql_fdw, FDW for PostgreSQL server

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2012-02-16 13:41:01
Message-ID: 4F3D076D.40907@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kaigai-san,

Thanks for the review. Attached patches are revised version, though
only fdw_helper_v5.patch is unchanged.

(2012/02/16 0:09), Kohei KaiGai wrote:
> [memory context of tuple store]
> It calls tuplestore_begin_heap() under the memory context of
> festate->scan_cxt at pgsqlBeginForeignScan.

Yes, it's because tuplestore uses a context which was current when
tuplestore_begin_heap was called. I want to use per-scan context for
tuplestore, to keep its content tuples alive through the scan.

> On the other hand, tuplestore_gettupleslot() is called under the
> memory context of festate->tuples.

Yes, result tuples to be returned to executor should be allocated in
per-scan context and live until next IterateForeignScan (or
EndForeignScan), because such tuple will be released via ExecClearTuple
in next IterateForeignScan call. If we don't switch context to per-scan
context, result tuple is allocated in per-tuple context and cause
double-free and server crash.

> I could not find a callback functions being invoked on errors,
> so I doubt the memory objects acquired within tuplestore_begin_heap()
> shall be leaked, even though it is my suggestion to create a sub-context
> under the existing one.

How do you confirmed that no callback function is invoked on errors? I
think that memory objects acquired within tuplestore_begin_heap (I guess
you mean excluding stored tuples, right?) are released during cleanup of
aborted transaction. I tested that by adding elog(ERROR) to the tail of
store_result() for intentional error, and execute large query 100 times
in a session. I saw VIRT value (via top command) comes down to constant
level after every query.

> In my opinion, it is a good choice to use es_query_cxt of the supplied EState.
> What does prevent to apply this per-query memory context?

Ah, I've confused context management of pgsql_fdw... I fixed pgsql_fdw
to create per-scan context as a child of es_query_cxt in
BeginForeignScan, and use it for tuplestore of the scan. So, tuplestore
and its contents are released correctly at EndForeignScan, or cleanup of
aborted transaction in error case.

> You mention about PGresult being malloc()'ed. However, it seems to me
> fetch_result() and store_result() once copy the contents on malloc()'ed
> area to the palloc()'ed area, and PQresult is released on an error using
> PG_TRY() ... PG_CATCH() block.

During thinking about this comment, I found double-free bug of PGresult
in execute_query, thanks :)

But, sorry, I'm not sure what the concern you show here is. The reason
why copying tuples from malloc'ed area to palloc'ed area is to release
PGresult before returning from the IterateForeingScan call. The reason
why using PG_TRY block is to sure that PGresult is released before jump
back to upstream in error case.

> [Minor comments]
> Please set NULL to "sql" variable at begin_remote_tx().
> Compiler raises a warnning due to references of uninitialized variable,
> even though the code path never run.

Fixed. BTW, just out of curiosity, which compiler do you use? My
compiler ,gcc (GCC) 4.6.0 20110603 (Red Hat 4.6.0-10) on Fedora 15,
doesn't warn it.

> It potentially causes a problem in case when fetch_result() raises an
> error because of unexpected status (!= PGRES_TUPLES_OK).
> One code path is not protected with PG_TRY(), and other code path
> will call PQclear towards already released PQresult.

Fixed.

> Although it is just a preference of mine, is the exprFunction necessary?
> It seems to me, the point of push-down check is whether the supplied
> node is built-in object, or not. So, an sufficient check is is_builtin() onto
> FuncExpr->funcid, OpExpr->opno, ScalarArrayOpExpr->opno and so on.
> It does not depend on whether the function implementing these nodes
> are built-in or not.

Got rid of exprFunction and fixed foreign_expr_walker to check function
oid in each case label.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
fdw_helper_v5.patch text/plain 12.8 KB
pgsql_fdw_v10.patch text/plain 107.2 KB
pgsql_fdw_pushdown_v6.patch text/plain 34.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-02-16 13:53:48 Re: 16-bit page checksums for 9.2
Previous Message Albert Cervera i Areny 2012-02-16 11:48:13 Re: 16-bit page checksums for 9.2