Re: Async execution of postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Async execution of postgres_fdw.
Date: 2014-12-24 07:32:12
Message-ID: 20141224.163212.21062518.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, thank you for the comment, Ashutosh.

I'll return after the New Year holidays. Very sorry not
addressing them sooner but then I will have more time on this.

Have a happy holidays.

regards,

=====
> Hi Horiguchi-san,
> Here are my comments for the patches together
>
> Sanity
> 1. The patch applies cleanly but has trailing white spaces.
> [ashutosh(at)ubuntu coderoot]git apply
> /mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch
> /mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:41: trailing whitespace.
> entry->conn =
> /mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:44: trailing whitespace.
>
> /mnt/hgfs/tmp/0001-Async-exec-of-postgres_fdw.patch:611: trailing
> whitespace.
>
> warning: 3 lines add whitespace errors.
>
> 2. The patches compile cleanly.
> 3. The regression is clean, even in contrib/postgres_fdw and
> contrib/file_fdw
>
> Tests
> -------
> We need tests to make sure that the logic remains intact even after further
> changes in this area. Couple of tests which require multiple foreign scans
> within the same query fetching rows more than fetch size (100) would be
> required. Also, some DMLs, which involve multiple foreign scans would test
> the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
> scans I mean both multiple scans on a single foreign server and multiple
> scans spread across multiple foreign servers.
>
> Code
> -------
> Because previous "conn" is now replaced by "conn->pgconn", the double
> indirection makes the code a bit ugly and prone to segfaults (conn being
> NULL or invalid pointer). Can we minimize such code or wrap it under a
> macro?
>
> We need some comments about the structure definition of PgFdwConn and its
> members explaining the purpose of this structure and its members.
>
> Same is the case with enum PgFdwConnState. In fact, the state diagram of a
> connection has become more complicated with the async connections, so it
> might be better to explain that state diagram at one place in the code
> (through comments). The definition of the enum might be a good place to do
> that. Otherwise, the logic of connection maintenance is spread at multiple
> places and is difficult to understand by looking at the code.
>
> In function GetConnection(), at line
> elog(DEBUG3, "new postgres_fdw connection %p for server \"%s\"",
> - entry->conn, server->servername);
> + entry->conn->pgconn, server->servername);
>
> entry->conn->pgconn may not necessarily be a new connection and may be NULL
> (as the next line check it for being NULL). So, I think this line should be
> moved within the following if block after pgconn has been initialised with
> the new connection.
> + if (entry->conn->pgconn == NULL)
> + {
> + entry->conn->pgconn = connect_pg_server(server, user);
> + entry->conn->nscans = 0;
> + entry->conn->async_state = PGFDW_CONN_IDLE;
> + entry->conn->async_scan = NULL;
> + }
>
> The if condition if (entry->conn == NULL) in GetConnection(), used to track
> whether there is a PGConn active for the given entry, now it tracks whether
> it has PgFdwConn for the same.
>
> Please see more comments inline.
>
> On Mon, Dec 15, 2014 at 2:39 PM, Kyotaro HORIGUCHI <
> horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> >
> >
> > * Outline of this patch
> >
> > From some consideration after the previous discussion and
> > comments from others, I judged the original (WIP) patch was
> > overdone as the first step. So I reduced the patch to minimal
> > function. The new patch does the following,
> >
> > - Wrapping PGconn by PgFdwConn in order to handle multiple scans
> > on one connection.
> >
> > - The core async logic was added in fetch_more_data().
> >
>
> It might help if you can explain this logic in this mail as well as in code
> (as per my comment above).
>
>
> >
> > - Invoking remote commands asynchronously in ExecInitForeignScan.
> >
>
> > - Canceling async invocation if any other foreign scans,
> > modifies, deletes use the same connection.
> >
>
> > Cancellation is done by immediately fetching the return of
> > already-invoked acync command.
> >
>
> > * Where this patch will be effective.
> >
> > With upcoming inheritance-partition feature, this patch enables
> > stating and running foreign scans asynchronously. It will be more
> > effective for longer TAT or remote startup times, and larger
> > number of foreign servers. No negative performance effect on
> > other situations.
> >
> >
> AFAIU, this logic sends only the first query in asynchronous manner not all
> of them. Is that right? If yes, I think it's a sever limitation of the
> feature. For a query involving multiple foreign scans, only the first one
> will be done in async fashion and not the rest. Sorry, if my understanding
> is wrong.
>
> I think, we need some data which shows the speed up by this patch. You may
> construct a case, where a single query involved multiple foreign scans, and
> we can check what is the speed up obtained against the number of foreign
> scans.
>
>
>
> > * Concerns about this patch.
> >
> > - This breaks the assumption that scan starts at ExecForeignScan,
> > not ExecInitForeignScan, which might cause some problem.
> >
> >
> This should be fine as long as it doesn't have any side effects like
> sending query during EXPLAIN (which has been taken care of in this patch.)
> Do you think, we need any special handling for PREPAREd statements?
>
>
> > - error reporting code in do_sql_command is quite ugly..
> >
> >
> >
> > regards,
> >
> > --
> > Kyotaro Horiguchi
> > NTT Open Source Software Center

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2014-12-24 07:56:31 Re: bin checks taking too long.
Previous Message Kyotaro HORIGUCHI 2014-12-24 07:26:11 Re: alter user set local_preload_libraries.