Re: Optimization for updating foreign tables in Postgres FDW

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2014-07-09 15:34:27
Message-ID: CA+TgmoZko+cWGuyjxKt6rfGTtGnEYUOzkng4O0dzpEeANFgqLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 8, 2014 at 3:07 AM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Attached is a WIP patch for the following:
>
> /*
> * postgresPlanForeignModify
> * Plan an insert/update/delete operation on a foreign table
> *
> * Note: currently, the plan tree generated for UPDATE/DELETE will always
> * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
> * and then the ModifyTable node will have to execute individual remote
> * UPDATE/DELETE commands. If there are no local conditions or joins
> * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
> * and then do nothing at ModifyTable. Room for future optimization ...
> */
>
> In the patch postgresPlanForeignModify has been modified so that if, in
> addition to the above condition, the followings are satisfied, then the
> ForeignScan and ModifyTable node will work that way.
>
> - There are no local BEFORE/AFTER triggers.
> - In UPDATE it's safe to evaluate expressions to assign to the target
> columns on the remote server.
>
> Here is a simple performance test.
>
> On remote side:
> postgres=# create table t (id serial primary key, inserted timestamp
> default clock_timestamp(), data text);
> CREATE TABLE
> postgres=# insert into t(data) select random() from generate_series(0,
> 99999);
> INSERT 0 100000
> postgres=# vacuum t;
> VACUUM
>
> On local side:
> postgres=# create foreign table ft (id integer, inserted timestamp, data
> text) server myserver options (table_name 't');
> CREATE FOREIGN TABLE
>
> Unpatched:
> postgres=# explain analyze verbose delete from ft where id < 10000;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=1275.255..1275.255 rows=0 loops=1)
> Remote SQL: DELETE FROM public.t WHERE ctid = $1
> -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=1.180..52.095 rows=9999 loops=1)
> Output: ctid
> Remote SQL: SELECT ctid FROM public.t WHERE ((id < 10000)) FOR
> UPDATE
> Planning time: 0.112 ms
> Execution time: 1275.733 ms
> (7 rows)
>
> Patched (Note that the DELETE command has been pushed down.):
> postgres=# explain analyze verbose delete from ft where id < 10000;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------
> Delete on public.ft (cost=100.00..162.32 rows=910 width=6) (actual
> time=0.006..0.006 rows=0 loops=1)
> -> Foreign Scan on public.ft (cost=100.00..162.32 rows=910 width=6)
> (actual time=0.001..0.001 rows=0 loops=1)
> Output: ctid
> Remote SQL: DELETE FROM public.t WHERE ((id < 10000))
> Planning time: 0.101 ms
> Execution time: 8.808 ms
> (6 rows)
>
> I'll add this to the next CF. Comments are welcome.

I haven't looked at the code, but +1 for the general idea. The
concept seems good to me, and that's a very large performance
improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2014-07-09 15:45:48 Re: RLS Design
Previous Message Robert Haas 2014-07-09 15:28:11 Re: better atomics - v0.5