Re: Introducing coarse grain parallelism by postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-25 08:35:04
Message-ID: 20140725.173504.267457780.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

I noticed that postgres_fdw can run in parallel with a very small
change. The attached patch lets scans by postgres_fdw on
different foreign servers run simultaneously. This seems a
convenient entry point to parallel execution.
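
For readers skimming the thread, the whole trick is that the remote
query is dispatched on each connection without waiting for its result,
so the long startup phases of the scans overlap. A minimal sketch of
that libpq pattern follows; the function names and the fixed batch
size of 100 are illustrative, not the patch's actual code.

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Kick off a batch on one connection; returns immediately. */
    static int
    start_remote_fetch(PGconn *conn, const char *cursor)
    {
        char        sql[64];

        snprintf(sql, sizeof(sql), "FETCH 100 FROM %s", cursor);
        return PQsendQuery(conn, sql);  /* 1 on success, does not block */
    }

    /* Collect the batch once the rows are actually needed. */
    static PGresult *
    finish_remote_fetch(PGconn *conn)
    {
        PGresult   *res = PQgetResult(conn);    /* blocks until rows arrive */

        if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "remote FETCH failed: %s", PQerrorMessage(conn));

        /* consume remaining results (the NULL terminator for a single
         * command) so the connection is ready for the next command */
        while (PQgetResult(conn) != NULL)
            ;
        return res;
    }

With one start/finish pair per foreign server, both remote startups
run concurrently, which is where the halved response time below comes
from.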

For the test configuration which the attached SQL script creates,
it almost halves the response time, because the remote queries
take far longer to start up than to run. The three foreign
tables fvs1, fvs2 and fvs1_2 are defined on the same table, but
fvs1 and fvs1_2 are on the same foreign server pgs1 while fvs2 is
on another foreign server pgs2.

=# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join fvs1_2 b on (a.a = b.a);
QUERY PLAN
-----------------------------------------------------------------------
 Hash Join (actual time=12083.640..12083.657 rows=16 loops=1)
   Hash Cond: (a.a = b.a)
   ->  Foreign Scan on fvs1 a (actual time=6091.405..6091.407 rows=10 loops=1)
   ->  Hash (actual time=5992.212..5992.212 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 7kB
         ->  Foreign Scan on fvs1_2 b (actual time=5992.191..5992.198 rows=10 loops=1)
 Execution time: 12085.330 ms
(7 rows)

=# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join fvs2 b on (a.a = b.a);
QUERY PLAN
-----------------------------------------------------------------------
 Hash Join (actual time=6325.004..6325.019 rows=16 loops=1)
   Hash Cond: (a.a = b.a)
   ->  Foreign Scan on fvs1 a (actual time=6324.910..6324.913 rows=10 loops=1)
   ->  Hash (actual time=0.073..0.073 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 7kB
         ->  Foreign Scan on fvs2 b (actual time=0.048..0.052 rows=10 loops=1)
 Execution time: 6327.708 ms
(7 rows)

For comparison, the purely local query is executed as below:

=# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM v a join v b on (a.a = b.a);
QUERY PLAN
------------------------------------------------------------------------------
 Hash Join (actual time=15757.915..15757.925 rows=16 loops=1)
   Hash Cond: (a.a = b.a)
   ->  Limit (actual time=7795.919..7795.922 rows=10 loops=1)
         ->  Sort (actual time=7795.915..7795.915 rows=10 loops=1)
               ->  Nested Loop (actual time=54.769..7795.618 rows=252 loops=1)
                     ->  Seq Scan on t a (actual time=0.010..2.117 rows=5000 loops=1)
                     ->  Materialize (actual time=0.000..0.358 rows=5000 loops=5000)
                           ->  Seq Scan on t b_1 (actual time=0.004..2.829 rows=5000 ...
   ->  Hash (actual time=7961.969..7961.969 rows=10 loops=1)
         ->  Subquery Scan on b (actual time=7961.948..7961.952 rows=10 loops=1)
               ->  Limit (actual time=7961.946..7961.950 rows=10 loops=1)
                     ->  Sort (actual time=7961.946..7961.948 rows=10 loops=1)
                           ->  Nested Loop (actual time=53.518..7961.611 rows=252 loops=1)
                                 ->  Seq Scan on t a_1 (actual time=0.004..2.247 rows=5000...
                                 ->  Materialize (actual time=0.000..0.357 rows=5000...
                                       ->  Seq Scan on t b_2 (actual time=0.001..1.565 rows=500..
 Execution time: 15758.629 ms
(26 rows)

I will pursue this approach for now.

Any opinions or suggestions?

- Is this a correct entry point?

- Parallel postgres_fdw is of course an intermediate shape. It
  should evolve toward a more intrinsic form.

- The planner should be aware of parallelism. The first step seems
  doable since postgres_fdw can obtain correct startup and running
  costs, but in the end they should perhaps be calculated locally
  for loopback connections. A dedicated node would be needed.

- A far more effective means of intercommunication between backends,
  including background workers (which seems to be discussed in
  another thread), is needed, and this could be the test bench for
  it.

- This patch is the minimal implementation to make parallel scans
  available. A facility for exporting/importing execution trees may
  allow far more flexible parallelism. Could deparsing be used to
  reconstruct partial queries?

- A means of resource management, especially over the number of
  backends, is required. This could be done per foreign server in a
  simple form for now. Should it eventually be moved into an
  intrinsic loopback connection manager?

- Any other points to consider?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001_postgres_fdw_parallelism_v0.patch text/x-patch 5.5 KB
unknown_filename text/plain 1.4 KB

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-25 10:07:01
Message-ID: CAFjFpRfjwgESmLELcuvSz9jBdAueUROn7OqX6oMrNObXbm7Owg@mail.gmail.com
Lists: pgsql-hackers

Hi Kyotaro,
fetch_more_data() always runs "FETCH 100 FROM <cursor_name>" on the foreign
server to get the next set of rows. The changes you have made seem to run
only the first FETCHes from all the nodes asynchronously, but not the
subsequent ones. The optimization will be helpful only when there are fewer
than 100 rows per postgres connection in the query. If there are more than
100 rows from a single foreign server, the second and subsequent FETCHes
will be serialized.
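
To make the serialization point concrete: each batch after the first
is one blocking round trip per connection, roughly of the following
shape (a simplified sketch, not the actual fetch_more_data() code).

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Synchronous batch fetch: the executor waits for the whole round
     * trip before it can return even one tuple upward. */
    static PGresult *
    fetch_batch(PGconn *conn, const char *cursor, int fetch_size)
    {
        char        sql[64];

        snprintf(sql, sizeof(sql), "FETCH %d FROM %s", fetch_size, cursor);
        return PQexec(conn, sql);       /* blocks on the network round trip */
    }

When two foreign scans each need several such batches, the executor
alternates between them, so every batch after the asynchronously
started first one pays its latency serially.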

Is my understanding correct?

On Fri, Jul 25, 2014 at 2:05 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello,
>
> I noticed that postgresql_fdw can run in parallel by very small
> change. The attached patch let scans by postgres_fdws on
> different foreign servers run sumiltaneously. This seems a
> convenient entry point to parallel execution.
>
> For the testing configuration which the attched sql script makes,
> it almost halves the response time because the remote queries
> take far longer startup time than running time. The two foreign
> tables fvs1, fvs2 and fvs1_2 are defined on the same table but
> fvs1 and fvs1_2 are on the same foreign server pgs1 and fvs2 is
> on the another foreign server pgs2.
>
> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join
> fvs1_2 b on (a.a = b.a);
> QUERY PLAN
> -----------------------------------------------------------------------
> Hash Join (actual time=12083.640..12083.657 rows=16 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a (actual time=6091.405..6091.407 rows=10
> loops=1)
> -> Hash (actual time=5992.212..5992.212 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 7kB
> -> Foreign Scan on fvs1_2 b (actual time=5992.191..5992.198 rows=10
> loops=1)
> Execution time: 12085.330 ms
> (7 rows)
>
> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join
> fvs2 b on (a.a = b.a);
> QUERY PLAN
> -----------------------------------------------------------------------
> Hash Join (actual time=6325.004..6325.019 rows=16 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a (actual time=6324.910..6324.913 rows=10
> loops=1)
> -> Hash (actual time=0.073..0.073 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 7kB
> -> Foreign Scan on fvs2 b (actual time=0.048..0.052 rows=10 loops=1)
> Execution time: 6327.708 ms
> (7 rows)
>
> In turn, pure local query is executed as below..
>
> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM v a join v b
> on (a.a = b.a);
> QUERY PLAN
>
> ------------------------------------------------------------------------------
> Hash Join (actual time=15757.915..15757.925 rows=16 loops=1)
> Hash Cond: (a.a = b.a)
> -> Limit (actual time=7795.919..7795.922 rows=10 loops=1)
> -> Sort (actual time=7795.915..7795.915 rows=10 loops=1)
> -> Nested Loop (actual time=54.769..7795.618 rows=252 loops=1)
> -> Seq Scan on t a (actual time=0.010..2.117 rows=5000
> loops=1)
> -> Materialize (actual time=0.000..0.358 rows=5000
> loops=5000)
> -> Seq Scan on t b_1 (actual time=0.004..2.829 rows=5000
> ...
> -> Hash (actual time=7961.969..7961.969 rows=10 loops=1)
> -> Subquery Scan on b (actual time=7961.948..7961.952 rows=10
> loops=1)
> -> Limit (actual time=7961.946..7961.950 rows=10 loops=1)
> -> Sort (actual time=7961.946..7961.948 rows=10 loops=1)
> -> Nested Loop (actual time=53.518..7961.611 rows=252
> loops=1)
> -> Seq Scan on t a_1 (actual time=0.004..2.247
> rows=5000...
> -> Materialize (actual time=0.000..0.357 rows=5000...
> -> Seq Scan on t b_2 (actual time=0.001..1.565
> rows=500..
> Execution time: 15758.629 ms
> (26 rows)
>
>
> I will try this way for the present.
>
> Any opinions or suggestions?
>
> - Is this a correct entry point?
>
> - Parallel postgres_fdw is of course a intermediate shape. It
> should go toward more intrinsic form.
>
> - Planner should be aware of parallelism. The first step seems to
> be doable since postgres_fdw can get correct startup and running
> costs. But they might should be calculated locally for loopback
> connections finally. Dedicated node would be needed.
>
> - The far effective intercommunication means between backends
> including backend workers (which seems to be discussed in
> another thread) is needed and this could be the test bench for
> it.
>
> - This patch is the minimal implement to get parallel scan
> available. A facility to exporting/importing execution trees may
> promise far flexible parallelism. Deparsing is usable to
> reconstruct partial query?
>
> - The means for resource management, especially on number of
> backends is required. This could be done on foreign server in a
> simple form for the present. Finally this will be moved into
> intrinsic loopback connection manager?
>
> - Any other points to consider?
>
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>
> DROP SERVER IF EXISTS pgs1 CASCADE;
> DROP SERVER IF EXISTS pgs2 CASCADE;
> DROP VIEW IF EXISTS v CASCADE;
> DROP TABLE IF EXISTS t CASCADE;
>
> CREATE SERVER pgs1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '/tmp',
> dbname 'postgres', use_remote_estimate 'true');
> CREATE SERVER pgs2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '/tmp',
> dbname 'postgres', use_remote_estimate 'true');
>
> CREATE USER MAPPING FOR CURRENT_USER SERVER pgs1;
> CREATE USER MAPPING FOR CURRENT_USER SERVER pgs2;
>
> CREATE TABLE t (a int, b int, c text);
> ALTER TABLE t ALTER COLUMN c SET STORAGE PLAIN;
> INSERT INTO t (SELECT random() * 10000, random() * 10000, repeat('X',
> (random() * 1000)::int) FROM generate_series(0, 4999));
> -- EXPLAIN ANALYZE SELECT * FROM t a, t b WHERE a.b + b.b = 1000 ORDER BY
> a.b LIMIT 10;
> CREATE VIEW v AS SELECT a.a, a.b, a.c, b.a AS a2, b.b AS b2, b.c AS c2
> FROM t a, t b WHERE a.b + b.b = 1000 ORDER BY a.b LIMIT 10;
>
> CREATE FOREIGN TABLE fvs1 (a int, b int, c text, a2 int, b2 int, c2 text)
> SERVER pgs1 OPTIONS (table_name 'v');
> CREATE FOREIGN TABLE fvs1_2 (a int, b int, c text, a2 int, b2 int, c2
> text) SERVER pgs1 OPTIONS (table_name 'v');
> CREATE FOREIGN TABLE fvs2 (a int, b int, c text, a2 int, b2 int, c2 text)
> SERVER pgs2 OPTIONS (table_name 'v');
>
>
> EXPLAIN ANALYZE SELECT a.a, a.b, b.c FROM fvs1 a join fvs2 b on (a.a =
> b.a);
> EXPLAIN ANALYZE SELECT a.a, a.b, b.c FROM fvs1 a join fvs1_2 b on (a.a =
> b.a);
>
>
>
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-25 10:40:37
Message-ID: CAFjFpRd+CKSGyfO9CgjMrQS3bX+K0CatDJaif4+vxLGe=kmL-w@mail.gmail.com
Lists: pgsql-hackers

In order to minimize the impact, what can be done is to execute
fetch_more_data() in asynchronous mode every time when there are only a few
rows left to be consumed. So, in the current code below,
1019         /*
1020          * Get some more tuples, if we've run out.
1021          */
1022         if (fsstate->next_tuple >= fsstate->num_tuples)
1023         {
1024             /* No point in another fetch if we already detected EOF, though. */
1025             if (!fsstate->eof_reached)
1026                 fetch_more_data(node, false);
1027             /* If we didn't get any tuples, must be end of data. */
1028             if (fsstate->next_tuple >= fsstate->num_tuples)
1029                 return ExecClearTuple(slot);
1030         }

replace the condition on line 1022, if (fsstate->next_tuple >=
fsstate->num_tuples), with if (fsstate->next_tuple >= fsstate->num_tuples -
SOME_BUFFER_NUMBER_ROWS).

Another possibility is to call PQsendQuery(conn, sql) after line 2100, if
eof_reached is false.

2096             /* Must be EOF if we didn't get as many tuples as we asked for. */
2097             fsstate->eof_reached = (numrows < fetch_size);
2098
2099             PQclear(res);
2100             res = NULL;
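
A sketch of that second suggestion: right after the current batch has
been stored (around line 2100 above), dispatch the next FETCH without
waiting, so it is already in flight while the local plan consumes the
buffered rows. The fetch_in_flight flag is a hypothetical addition and
the code is illustrative, not the patch itself.

    /* ... still inside fetch_more_data(), just after "res = NULL;" ... */
    if (!fsstate->eof_reached)
    {
        char        sql[64];

        snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
                 fetch_size, fsstate->cursor_number);

        /* Dispatch asynchronously; collected later with PQgetResult(). */
        if (!PQsendQuery(conn, sql))
            ereport(ERROR,
                    (errmsg("could not dispatch remote FETCH: %s",
                            PQerrorMessage(conn))));

        fsstate->fetch_in_flight = true;    /* hypothetical state flag */
    }

On its next entry, fetch_more_data() would then have to pick up the
pending result with PQgetResult() instead of issuing a new synchronous
FETCH whenever fetch_in_flight is set.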

On Fri, Jul 25, 2014 at 3:37 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Hi Kyotaro,
> fetch_more_rows() always runs "FETCH 100 FROM <cursor_name>" on the
> foreign server to get the next set of rows. The changes you have made seem
> to run only the first FETCHes from all the nodes but not the subsequent
> ones. The optimization will be helpful only when there are less than 100
> rows per postgres connection in the query. If there are more than 100 rows
> from a single foreign server, the second onwards FETCHes will be serialized.
>
> Is my understanding correct?
>
>
> On Fri, Jul 25, 2014 at 2:05 PM, Kyotaro HORIGUCHI <
> horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
>> Hello,
>>
>> I noticed that postgresql_fdw can run in parallel by very small
>> change. The attached patch let scans by postgres_fdws on
>> different foreign servers run sumiltaneously. This seems a
>> convenient entry point to parallel execution.
>>
>> For the testing configuration which the attched sql script makes,
>> it almost halves the response time because the remote queries
>> take far longer startup time than running time. The two foreign
>> tables fvs1, fvs2 and fvs1_2 are defined on the same table but
>> fvs1 and fvs1_2 are on the same foreign server pgs1 and fvs2 is
>> on the another foreign server pgs2.
>>
>> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join
>> fvs1_2 b on (a.a = b.a);
>> QUERY PLAN
>> -----------------------------------------------------------------------
>> Hash Join (actual time=12083.640..12083.657 rows=16 loops=1)
>> Hash Cond: (a.a = b.a)
>> -> Foreign Scan on fvs1 a (actual time=6091.405..6091.407 rows=10
>> loops=1)
>> -> Hash (actual time=5992.212..5992.212 rows=10 loops=1)
>> Buckets: 1024 Batches: 1 Memory Usage: 7kB
>> -> Foreign Scan on fvs1_2 b (actual time=5992.191..5992.198 rows=10
>> loops=1)
>> Execution time: 12085.330 ms
>> (7 rows)
>>
>> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM fvs1 a join
>> fvs2 b on (a.a = b.a);
>> QUERY PLAN
>> -----------------------------------------------------------------------
>> Hash Join (actual time=6325.004..6325.019 rows=16 loops=1)
>> Hash Cond: (a.a = b.a)
>> -> Foreign Scan on fvs1 a (actual time=6324.910..6324.913 rows=10
>> loops=1)
>> -> Hash (actual time=0.073..0.073 rows=10 loops=1)
>> Buckets: 1024 Batches: 1 Memory Usage: 7kB
>> -> Foreign Scan on fvs2 b (actual time=0.048..0.052 rows=10 loops=1)
>> Execution time: 6327.708 ms
>> (7 rows)
>>
>> In turn, pure local query is executed as below..
>>
>> =# EXPLAIN (ANALYZE on, COSTS off) SELECT a.a, a.b, b.c FROM v a join v b
>> on (a.a = b.a);
>> QUERY PLAN
>>
>> ------------------------------------------------------------------------------
>> Hash Join (actual time=15757.915..15757.925 rows=16 loops=1)
>> Hash Cond: (a.a = b.a)
>> -> Limit (actual time=7795.919..7795.922 rows=10 loops=1)
>> -> Sort (actual time=7795.915..7795.915 rows=10 loops=1)
>> -> Nested Loop (actual time=54.769..7795.618 rows=252 loops=1)
>> -> Seq Scan on t a (actual time=0.010..2.117 rows=5000
>> loops=1)
>> -> Materialize (actual time=0.000..0.358 rows=5000
>> loops=5000)
>> -> Seq Scan on t b_1 (actual time=0.004..2.829 rows=5000
>> ...
>> -> Hash (actual time=7961.969..7961.969 rows=10 loops=1)
>> -> Subquery Scan on b (actual time=7961.948..7961.952 rows=10
>> loops=1)
>> -> Limit (actual time=7961.946..7961.950 rows=10 loops=1)
>> -> Sort (actual time=7961.946..7961.948 rows=10 loops=1)
>> -> Nested Loop (actual time=53.518..7961.611 rows=252
>> loops=1)
>> -> Seq Scan on t a_1 (actual time=0.004..2.247
>> rows=5000...
>> -> Materialize (actual time=0.000..0.357 rows=5000...
>> -> Seq Scan on t b_2 (actual time=0.001..1.565
>> rows=500..
>> Execution time: 15758.629 ms
>> (26 rows)
>>
>>
>> I will try this way for the present.
>>
>> Any opinions or suggestions?
>>
>> - Is this a correct entry point?
>>
>> - Parallel postgres_fdw is of course a intermediate shape. It
>> should go toward more intrinsic form.
>>
>> - Planner should be aware of parallelism. The first step seems to
>> be doable since postgres_fdw can get correct startup and running
>> costs. But they might should be calculated locally for loopback
>> connections finally. Dedicated node would be needed.
>>
>> - The far effective intercommunication means between backends
>> including backend workers (which seems to be discussed in
>> another thread) is needed and this could be the test bench for
>> it.
>>
>> - This patch is the minimal implement to get parallel scan
>> available. A facility to exporting/importing execution trees may
>> promise far flexible parallelism. Deparsing is usable to
>> reconstruct partial query?
>>
>> - The means for resource management, especially on number of
>> backends is required. This could be done on foreign server in a
>> simple form for the present. Finally this will be moved into
>> intrinsic loopback connection manager?
>>
>> - Any other points to consider?
>>
>>
>> regards,
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>> DROP SERVER IF EXISTS pgs1 CASCADE;
>> DROP SERVER IF EXISTS pgs2 CASCADE;
>> DROP VIEW IF EXISTS v CASCADE;
>> DROP TABLE IF EXISTS t CASCADE;
>>
>> CREATE SERVER pgs1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host
>> '/tmp', dbname 'postgres', use_remote_estimate 'true');
>> CREATE SERVER pgs2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host
>> '/tmp', dbname 'postgres', use_remote_estimate 'true');
>>
>> CREATE USER MAPPING FOR CURRENT_USER SERVER pgs1;
>> CREATE USER MAPPING FOR CURRENT_USER SERVER pgs2;
>>
>> CREATE TABLE t (a int, b int, c text);
>> ALTER TABLE t ALTER COLUMN c SET STORAGE PLAIN;
>> INSERT INTO t (SELECT random() * 10000, random() * 10000, repeat('X',
>> (random() * 1000)::int) FROM generate_series(0, 4999));
>> -- EXPLAIN ANALYZE SELECT * FROM t a, t b WHERE a.b + b.b = 1000 ORDER BY
>> a.b LIMIT 10;
>> CREATE VIEW v AS SELECT a.a, a.b, a.c, b.a AS a2, b.b AS b2, b.c AS c2
>> FROM t a, t b WHERE a.b + b.b = 1000 ORDER BY a.b LIMIT 10;
>>
>> CREATE FOREIGN TABLE fvs1 (a int, b int, c text, a2 int, b2 int, c2 text)
>> SERVER pgs1 OPTIONS (table_name 'v');
>> CREATE FOREIGN TABLE fvs1_2 (a int, b int, c text, a2 int, b2 int, c2
>> text) SERVER pgs1 OPTIONS (table_name 'v');
>> CREATE FOREIGN TABLE fvs2 (a int, b int, c text, a2 int, b2 int, c2 text)
>> SERVER pgs2 OPTIONS (table_name 'v');
>>
>>
>> EXPLAIN ANALYZE SELECT a.a, a.b, b.c FROM fvs1 a join fvs2 b on (a.a =
>> b.a);
>> EXPLAIN ANALYZE SELECT a.a, a.b, b.c FROM fvs1 a join fvs1_2 b on (a.a =
>> b.a);
>>
>>
>>
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-28 09:15:45
Message-ID: 20140728.181545.225059735.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello, thank you for the comment.

> Hi Kyotaro,
> fetch_more_rows() always runs "FETCH 100 FROM <cursor_name>" on the foreign
> server to get the next set of rows. The changes you have made seem to run
> only the first FETCHes from all the nodes but not the subsequent ones. The
> optimization will be helpful only when there are less than 100 rows per
> postgres connection in the query. If there are more than 100 rows from a
> single foreign server, the second onwards FETCHes will be serialized.
>
> Is my understanding correct?

Yes, you're right. That is why I wrote the following:

Me> it almost halves the response time, because the remote queries
Me> take far longer to start up than to run.

Parallelizing all FETCHes would be effective if the connection
transfers bytes at a speed close to the row fetch speed, but I
excluded that case on the assumption that the chance of a gain
there is relatively low, and for simplicity as a PoC. If this
approach is not inappropriate and does not draw objections, I
will work on a more complete implementation, including cost
estimation.

> On Fri, Jul 25, 2014 at 2:05 PM, Kyotaro HORIGUCHI <
> horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> > Hello,
> >
> > I noticed that postgresql_fdw can run in parallel by very small
> > change. The attached patch let scans by postgres_fdws on
> > different foreign servers run sumiltaneously. This seems a
> > convenient entry point to parallel execution.
> >
> > For the testing configuration which the attched sql script makes,
> > it almost halves the response time because the remote queries
> > take far longer startup time than running time. The two foreign
> > tables fvs1, fvs2 and fvs1_2 are defined on the same table but
> > fvs1 and fvs1_2 are on the same foreign server pgs1 and fvs2 is
> > on the another foreign server pgs2.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-07-28 09:24:43
Message-ID: 20140728.182443.54408478.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

> In order to minimize the impact, what can be done is to execute
> fetch_more_data() in asynchronous mode every time, when there only few rows
> left to be consumed. So in current code below
> 1019 /*
> 1020 * Get some more tuples, if we've run out.
> 1021 */
> 1022 if (fsstate->next_tuple >= fsstate->num_tuples)
> 1023 {
> 1024 /* No point in another fetch if we already detected EOF,
> though. */
> 1025 if (!fsstate->eof_reached)
> 1026 fetch_more_data(node, false);
> 1027 /* If we didn't get any tuples, must be end of data. */
> 1028 if (fsstate->next_tuple >= fsstate->num_tuples)
> 1029 return ExecClearTuple(slot);
> 1030 }
>
> replace line 1022 with if (fsstate->next_tuple >= fsstate->num_tuples)
> with if (fsstate->next_tuple >= fsstate->num_tuples -
> SOME_BUFFER_NUMBER_ROWS)
> Other possibility is to call PQsendQuery(conn, sql), after line 2100 and if
> eof_reached is false.
>
> 2096 /* Must be EOF if we didn't get as many tuples as we asked
> for. */
> 2097 fsstate->eof_reached = (numrows < fetch_size);
> 2098
> 2099 PQclear(res);
> 2100 res = NULL;

I see, I'll consider it. If late (lazy) error detection is
acceptable, single-row mode seems usable, too.
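
For reference, a minimal sketch of libpq's single-row mode, which
delivers rows as they arrive instead of buffering the whole result;
errors surface only with the final PGresult, hence the late error
detection mentioned above. Names here are illustrative.

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    scan_in_single_row_mode(PGconn *conn, const char *sql)
    {
        PGresult   *res;

        if (!PQsendQuery(conn, sql))
        {
            fprintf(stderr, "dispatch failed: %s", PQerrorMessage(conn));
            return;
        }
        /* must be requested immediately after dispatching the query */
        if (!PQsetSingleRowMode(conn))
            fprintf(stderr, "could not enter single-row mode\n");

        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
                printf("%s\n", PQgetvalue(res, 0, 0));  /* one row at a time */
            else if (PQresultStatus(res) != PGRES_TUPLES_OK)
                fprintf(stderr, "remote query failed: %s",
                        PQerrorMessage(conn));
            PQclear(res);
        }
    }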

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-01 09:10:55
Message-ID: 20140801.181055.143272972.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello, this is the new version, which brings parallelism based on
postgres_fdw to some degree of completeness.

It compares the costs of parallel and non-parallel execution and
chooses the parallel one if it is faster by a margin specified
by GUCs. The attached files are:

0001_parallel_exec_planning_v0.patch:
- core PostgreSQL changes for parallel execution planning.

0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
- postgres_fdw parallelization.

0003_file_fdw_changes_to_avoid_error.patch:
- error-avoiding changes for file_fdw (not strictly required by this patch)

env.sql:
- a simple test script to try this patch.

=====

- Planner changes to handle the cost of parallel execution,
  including an indication of parallel execution.

- GUCs to control how readily plans go parallel.

  parallel_cost_threshold is the threshold of path total cost
  above which parallel execution is enabled.

  parallel_ratio_threshold is the threshold on the ratio of
  parallel cost to non-parallel cost below which the parallel
  path is chosen.

- postgres_fdw changes so that it can run in multiple sessions
  using snapshot export, and fetch in parallel for foreign scans
  on dedicated connections (see the sketch after this list).

  The foreign server has a new option, 'max_aux_connections',
  which limits the number of connections used for parallel
  execution per (server, user) pair.

- file_fdw changes to follow the planner changes.
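
The snapshot export mentioned in the list above is the standard
mechanism: the primary remote session exports its snapshot and each
auxiliary connection imports it inside a REPEATABLE READ transaction,
so all connections see the same data. A sketch of that handshake over
libpq; helper names are illustrative and error handling is omitted.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <libpq-fe.h>

    /* On the primary remote connection: export the snapshot identifier.
     * The exporting transaction must stay open while auxiliaries use it. */
    static char *
    export_remote_snapshot(PGconn *conn)
    {
        PGresult   *res = PQexec(conn, "SELECT pg_export_snapshot()");
        char       *snap = strdup(PQgetvalue(res, 0, 0));

        PQclear(res);
        return snap;
    }

    /* On each auxiliary connection: adopt the exported snapshot. */
    static void
    import_remote_snapshot(PGconn *aux, const char *snap)
    {
        char        sql[128];

        PQclear(PQexec(aux,
                "START TRANSACTION ISOLATION LEVEL REPEATABLE READ"));
        snprintf(sql, sizeof(sql), "SET TRANSACTION SNAPSHOT '%s'", snap);
        PQclear(PQexec(aux, sql));
    }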

With the patches applied, the attached SQL script gives the
following result (with some line breaks added).

postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
FROM fvs1 a join fvs1_2 b on (a.a = b.a);
QUERY PLAN
----------------------------------------------------------------------------
 Hash Join  (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
            (actual time=2213.400..2213.407 rows=12 loops=1)
   Hash Cond: (a.a = b.a)
   ->  Foreign Scan on fvs1 a
         (cost=9573392.96..9573393.29 rows=10 width=8 parallel)
         (actual time=2199.992..2199.993 rows=10 loops=1)
   ->  Hash  (cost=9573393.29..9573393.29 rows=10 width=36)
             (actual time=13.388..13.388 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 6kB
         ->  Foreign Scan on fvs1_2 b
               (cost=9573392.96..9573393.29 rows=10 width=36 parallel)
               (actual time=13.376..13.379 rows=10 loops=1)
 Planning time: 4.761 ms
 Execution time: 2227.462 ms
(8 rows)
postgres=# SET parallel_ratio_threshold to 0.0;
postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
FROM fvs1 a join fvs1 b on (a.a = b.a);
QUERY PLAN
------------------------------------------------------------------------------
 Hash Join  (cost=318084.32..318084.69 rows=1 width=40)
            (actual time=4302.913..4302.928 rows=12 loops=1)
   Hash Cond: (a.a = b.a)
   ->  Foreign Scan on fvs1 a  (cost=159041.93..159042.26 rows=10 width=8)
                               (actual time=2122.989..2122.992 rows=10 loops=1)
   ->  Hash  (cost=159042.26..159042.26 rows=10 width=500)
             (actual time=2179.900..2179.900 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 6kB
         ->  Foreign Scan on fvs1 b
               (cost=159041.93..159042.26 rows=10 width=500)
               (actual time=2179.856..2179.864 rows=10 loops=1)
 Planning time: 5.085 ms
 Execution time: 4303.728 ms
(8 rows)

Where, "parallel" indicates that the node includes nodes run in
parallel. The latter EXPLAIN shows the result when parallel
execution is inhibited.

Since the lack of time, sorry that the details for this patch is
comming later.

Is there any suggestions or opinions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


Attachment Content-Type Size
0001_parallel_exec_planning_v0.patch text/x-patch 30.5 KB
0002_enable_postgres_fdw_to_run_in_parallel_v0.patch text/x-patch 27.5 KB
0003_file_fdw_changes_to_avoid_error.patch text/x-patch 627 bytes
unknown_filename text/plain 1.1 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-01 09:18:37
Message-ID: 20140801.181837.27561179.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hello,

> Hello, this is the new version which is complete to some extent
> of parallelism based on postgres_fdw.
>
> This compares the costs for parallel and non-parallel execution
> and choose parallel one if it is faster by some extent specified
> by GUCs. The attached files are,
>
> 0001_parallel_exec_planning_v0.patch:
> - PostgreSQL body stuff for parallel execution planning.
>
> 0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
> - postgres_fdw parallelization.
>
> 0003_file_fdw_changes_to_avoid_error.patch:
> - error avoidig stuff for file_fdw (not necessary for this patch)
>
> env.sql:
> - simple test script to try this patch.
>
> =====
>
> - planner stuff to handle cost of parallel execution. Including
> indication of parallel execution.
>
> - GUCs to control how easy to go parallel.
>
> parallel_cost_threshold is the threshold of path total cost
> where to enable parallel execution.
>
> prallel_ratio_threshond is the threshold of the ratio of
> parallel cost to non-parallel cost where to choose the
> parallel path.
>
> - postgres_fdw which can run in multiple sessions using snapshot
> export and fetches in parallel for foreign scans on dedicated
> connections.

But for now, the effect of asynchronous execution of FETCHes is
not accounted for during planning.

> foreign server has a new option 'max_aux_connections', which
> limits the number of connections for parallel execution per
> (server, user) pairs.
>
> - change file_fdw to follow the changes of planner stuff.
>
>
> Whth the patch attached, the attached sql script shows the
> following result (after some line breaks are added).
>
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1_2 b on (a.a = b.a);
> QUERY PLAN
> ----------------------------------------------------------------------------
> Hash Join (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
> (actual time=2213.400..2213.407 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a
> (cost=9573392.96..9573393.29 rows=10 width=8 parallel)
> (actual time=2199.992..2199.993 rows=10 loops=1)
> -> Hash (cost=9573393.29..9573393.29 rows=10 width=36)
> (actual time=13.388..13.388 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1_2 b
> (cost=9573392.96..9573393.29 rows=10 width=36 parallel)
> (actual time=13.376..13.379 rows=10 loops=1)
> Planning time: 4.761 ms
> Execution time: 2227.462 ms
> (8 rows)
> postgres=# SET parallel_ratio_threshold to 0.0;
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1 b on (a.a = b.a);
> QUERY PLAN
> ------------------------------------------------------------------------------
> Hash Join (cost=318084.32..318084.69 rows=1 width=40)
> (actual time=4302.913..4302.928 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a (cost=159041.93..159042.26 rows=10 width=8)
> (actual time=2122.989..2122.992 rows=10 loops=1)
> -> Hash (cost=159042.26..159042.26 rows=10 width=500)
> (actual time=2179.900..2179.900 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1 b
> (cost=159041.93..159042.26 rows=10 width=500)
> (actual time=2179.856..2179.864 rows=10 loops=1)
> Planning time: 5.085 ms
> Execution time: 4303.728 ms
> (8 rows)
>
> Where, "parallel" indicates that the node includes nodes run in
> parallel. The latter EXPLAIN shows the result when parallel
> execution is inhibited.
>
> Since the lack of time, sorry that the details for this patch is
> comming later.
>
> Is there any suggestions or opinions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-05 11:05:38
Message-ID: CAFjFpRe3DMxnqqmL09yeyuSmtMLnJ8saDkwVKQdB2TOwppqg0w@mail.gmail.com
Lists: pgsql-hackers

Hi Kyotaro,
I looked at the patches and felt that the approach taken here is too
intrusive, considering that the feature is only for foreign scans.

There are quite a few members added to the generic Path and Plan structures
whose use is induced only through foreign scans. Each path now stores
two sets of costs, one with parallelism and one without. The parallel
values will make sense only when there is a foreign scan, which uses
parallelism, in the plan tree. So those costs are maintained unnecessarily,
or the memory for those members is wasted, in most cases, where the
tables involved are not foreign. Also, not many foreign data wrappers will
be able to use the parallelism, e.g. file_fdw. That's my opinion, though; I
would like to hear from others.

Instead, an FDW which can use parallelism could add two paths, one with and
one without parallelism, with appropriate costs, and let the logic that
chooses the cheapest path take care of the actual choice. In fact, I thought
parallelism would always be faster than the non-parallel alternative, except
when the foreign server is too heavily loaded, but we won't be able to check
that anyway. Can you point out a case where parallelism may not win over
serial execution?

BTW, the name "parallelism" seems misleading here. All it will be able
to do is fire the queries (or data fetch requests) asynchronously. So we
might want to change the naming appropriately.
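
For what it's worth, the two-path idea would look roughly like the
following inside postgresGetForeignPaths(), assuming the 9.4-era
create_foreignscan_path() signature; the *_parallel cost variables and
the "async" marker are placeholders, not existing postgres_fdw code.

    /* Usual serial remote scan. */
    add_path(baserel, (Path *)
             create_foreignscan_path(root, baserel,
                                     rows,
                                     startup_cost, total_cost,
                                     NIL,   /* no pathkeys */
                                     NULL,  /* no required outer rel */
                                     NIL)); /* no fdw_private */

    /* Same scan started asynchronously on an auxiliary connection:
     * much lower startup cost, slightly higher total cost for the
     * extra connection setup. */
    add_path(baserel, (Path *)
             create_foreignscan_path(root, baserel,
                                     rows,
                                     startup_cost_parallel,
                                     total_cost_parallel,
                                     NIL, NULL,
                                     list_make1(makeString(pstrdup("async")))));

add_path() keeps paths that are not dominated on both startup and total
cost, so both variants survive and the rest of the planner picks
whichever suits the query.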

On Fri, Aug 1, 2014 at 2:48 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello,
>
> > Hello, this is the new version which is complete to some extent
> > of parallelism based on postgres_fdw.
> >
> > This compares the costs for parallel and non-parallel execution
> > and choose parallel one if it is faster by some extent specified
> > by GUCs. The attached files are,
> >
> > 0001_parallel_exec_planning_v0.patch:
> > - PostgreSQL body stuff for parallel execution planning.
> >
> > 0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
> > - postgres_fdw parallelization.
> >
> > 0003_file_fdw_changes_to_avoid_error.patch:
> > - error avoidig stuff for file_fdw (not necessary for this patch)
> >
> > env.sql:
> > - simple test script to try this patch.
> >
> > =====
> >
> > - planner stuff to handle cost of parallel execution. Including
> > indication of parallel execution.
> >
> > - GUCs to control how easy to go parallel.
> >
> > parallel_cost_threshold is the threshold of path total cost
> > where to enable parallel execution.
> >
> > prallel_ratio_threshond is the threshold of the ratio of
> > parallel cost to non-parallel cost where to choose the
> > parallel path.
> >
> > - postgres_fdw which can run in multiple sessions using snapshot
> > export and fetches in parallel for foreign scans on dedicated
> > connections.
>
> But now the effect of async execution of FETCH'es is omitted
> during planning.
>
> > foreign server has a new option 'max_aux_connections', which
> > limits the number of connections for parallel execution per
> > (server, user) pairs.
> >
> > - change file_fdw to follow the changes of planner stuff.
> >
> >
> > Whth the patch attached, the attached sql script shows the
> > following result (after some line breaks are added).
> >
> > postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> > FROM fvs1 a join fvs1_2 b on (a.a = b.a);
> > QUERY PLAN
> >
> ----------------------------------------------------------------------------
> > Hash Join (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
> > (actual time=2213.400..2213.407 rows=12 loops=1)
> > Hash Cond: (a.a = b.a)
> > -> Foreign Scan on fvs1 a
> > (cost=9573392.96..9573393.29 rows=10 width=8 parallel)
> > (actual time=2199.992..2199.993 rows=10 loops=1)
> > -> Hash (cost=9573393.29..9573393.29 rows=10 width=36)
> > (actual time=13.388..13.388 rows=10 loops=1)
> > Buckets: 1024 Batches: 1 Memory Usage: 6kB
> > -> Foreign Scan on fvs1_2 b
> > (cost=9573392.96..9573393.29 rows=10 width=36
> parallel)
> > (actual time=13.376..13.379 rows=10 loops=1)
> > Planning time: 4.761 ms
> > Execution time: 2227.462 ms
> > (8 rows)
> > postgres=# SET parallel_ratio_threshold to 0.0;
> > postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> > FROM fvs1 a join fvs1 b on (a.a = b.a);
> > QUERY PLAN
> >
> ------------------------------------------------------------------------------
> > Hash Join (cost=318084.32..318084.69 rows=1 width=40)
> > (actual time=4302.913..4302.928 rows=12 loops=1)
> > Hash Cond: (a.a = b.a)
> > -> Foreign Scan on fvs1 a (cost=159041.93..159042.26 rows=10
> width=8)
> > (actual time=2122.989..2122.992 rows=10
> loops=1)
> > -> Hash (cost=159042.26..159042.26 rows=10 width=500)
> > (actual time=2179.900..2179.900 rows=10 loops=1)
> > Buckets: 1024 Batches: 1 Memory Usage: 6kB
> > -> Foreign Scan on fvs1 b
> > (cost=159041.93..159042.26 rows=10 width=500)
> > (actual time=2179.856..2179.864 rows=10 loops=1)
> > Planning time: 5.085 ms
> > Execution time: 4303.728 ms
> > (8 rows)
> >
> > Where, "parallel" indicates that the node includes nodes run in
> > parallel. The latter EXPLAIN shows the result when parallel
> > execution is inhibited.
> >
> > Since the lack of time, sorry that the details for this patch is
> > comming later.
> >
> > Is there any suggestions or opinions?
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-06 12:08:58
Message-ID: CA+TgmoZ0udKLs6F1zFG-Uoe6uo7JJJCWOPwQQu0xa2Oq8pE6ig@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 5, 2014 at 7:05 AM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> There are quite a few members added to the generic Path, Plan structures,
> whose use is is induced only through foreign scans. Each path now stores two
> sets of costs, one with parallelism and one without. The parallel values
> will make sense only when there is a foreign scan, which uses parallelism,
> in the plan tree. So, those costs are maintained unnecessarily or the memory
> for those members is wasted in most of the cases, where the tables involved
> are not foreign.

Yeah, I don't think that's going to be acceptable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-08 03:24:10
Message-ID: 20140808.122410.237073172.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hi, thank you for the comment.

> Hi Kyotaro,
> I looked at the patches and felt that the approach taken here is too
> intrusive, considering that the feature is only for foreign scans.

I agree with you, on the premise that it's only for foreign scans, but I
regard it as an example of parallel execution planning.

> There are quite a few members added to the generic Path, Plan structures,
> whose use is is induced only through foreign scans. Each path now stores
> two sets of costs, one with parallelism and one without. The parallel
> values will make sense only when there is a foreign scan, which uses
> parallelism, in the plan tree. So, those costs are maintained unnecessarily
> or the memory for those members is wasted in most of the cases, where the
> tables involved are not foreign. Also, not many foreign tables will be able
> to use the parallelism, e.g. file_fdw. Although, that's my opinion; I would
> like hear from others.

I intended to discuss what estimation and planning for parallel
execution (not limited to foreign scans) would look like.
Background workers would be able to take on executing some
portion of the path tree in 'parallel'. The postgres_fdw changes in
this patch are simply one case of planning parallel
execution. Admittedly, as you see, it only chooses whether
to go parallel for a path constructed without regard to parallel
execution, but considering every possible alternative path for
parallel execution would cost too much.

Limiting this discussion to parallel scans, the overall gain
from multiple simultaneous scans distributed in the path/plan tree
won't be known before cost accounting is done up to the root node
(more precisely, their common parent). This patch crudely does a
bucket brigade of the parallel cost up to the root node, but there
should be a smarter way to shortcut that: for example, simply
picking up parallelizable nodes by scanning the completed path/plan
tree, calculating the probably-eliminable costs from them, and then
subtracting that from, or comparing it to, the total (non-parallel)
cost. That might be more acceptable to everyone than the current
implementation.
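
A rough sketch of that shortcut, purely illustrative: walk the
finished plan tree, sum the startup cost of foreign scans that could
be started asynchronously, and treat the sum as an upper bound on what
overlapping can eliminate. scan_is_async_capable() is a hypothetical
predicate, and a real version would also have to visit subplans and
skip the one scan whose startup cannot be hidden.

    /* Upper bound on the startup cost that overlapping asynchronous
     * foreign scans could hide.  lefttree/righttree covers the common
     * cases but not every kind of child plan. */
    static Cost
    eliminable_startup_cost(Plan *plan)
    {
        Cost        saved = 0.0;

        if (plan == NULL)
            return 0.0;

        if (IsA(plan, ForeignScan) &&
            scan_is_async_capable((ForeignScan *) plan))
            saved += plan->startup_cost;

        saved += eliminable_startup_cost(plan->lefttree);
        saved += eliminable_startup_cost(plan->righttree);

        return saved;
    }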

> Instead, an FDW which can use parallelism can add two paths one with and
> one without parallelism with appropriate costs and let the logic choosing
> the cheapest path take care of the actual choice. In fact, I thought,
> parallelism would be always faster than the non-parallel one, except when
> the foreign server is too much loaded. But we won't be able to check that
> anyway. Can you point out a case where the parallelism may not win over
> serial execution?

It always wins against serial execution if parallel execution can
be launched at no extra cost. But in reality it consumes extra
resources, so I thought that parallel execution should be curbed
when the gain is small. That is what the two GUCs added by this
patch and choose_parallel_scans() do, although in a non-automated
way. The overloading issue is not confined to parallel execution,
but it will surely be more severe here since it is less visible to
and controllable by users. In the end, however, it would have to
come down to manual configuration anyway.

> BTW, the name parallelism seems to be misleading here. All, it will be able
> to do is fire the queries (or data fetch requests) asynchronously. So, we
> might want to change the naming appropriately.

That is right regarding what I actually did to postgres_fdw. But
since we cannot let all intermediate tuples from child execution
nodes running in parallel pile up in memory without restriction, I
consider all 'parallel' execution to be a kind of this 'asynchronous'
startup/fetching. As for postgres_fdw, it would look more like
'parallel' (and perhaps be more efficient) if queries were processed
using libpq's single-row mode instead of a cursor, but similar
processing takes place beneath the system calls even in that case.

Well, I will try to make a version that does not put parallel costs
into the plan/path structs, and that uses single-row mode for
postgres_fdw. I hope it will lead somewhere.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-08 06:23:13
Message-ID: CAFjFpRfdt+kUN5HKR8PhPhsOZBGe1zQYoubmQ9CTVKkDLHqtwg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 8, 2014 at 8:54 AM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hi, thank you for the comment.
>
> > Hi Kyotaro,
> > I looked at the patches and felt that the approach taken here is too
> > intrusive, considering that the feature is only for foreign scans.
>
> I agree to you premising that it's only for foreign scans but I
> regard it as an example of parallel execution planning.
>
> > There are quite a few members added to the generic Path, Plan structures,
> > whose use is is induced only through foreign scans. Each path now stores
> > two sets of costs, one with parallelism and one without. The parallel
> > values will make sense only when there is a foreign scan, which uses
> > parallelism, in the plan tree. So, those costs are maintained
> unnecessarily
> > or the memory for those members is wasted in most of the cases, where the
> > tables involved are not foreign. Also, not many foreign tables will be
> able
> > to use the parallelism, e.g. file_fdw. Although, that's my opinion; I
> would
> > like hear from others.
>
> I intended to discuss what the estimation and planning for
> parallel exexution (not limited to foreign scan) would be
> like. Backgroud worker would be able to take on executing some
> portion of path tree in 'parallel'. The postgres_fdw for this
> patch is simply a case in planning of parallel
> executions. Although, as you see, it does only choosing whether
> to go parallel for the path constructed regardless of parallel
> execution but thinking of the possible alternate paths of
> parallel execution will cost too much.
>
> Limiting to parallel scans for this discussion, the overall gain
> by multiple simultaneous scans distributed in path/plan tree
> won't be known before cost counting is done up to the root node
> (more precisely the common parent of them). This patch foolishly
> does bucket brigade of parallel cost up to root node, but there
> should be smarter way to shortcut it, for example, simplly
> picking up parallelizable nodes by scanning completed path/plan
> tree and calculate the probably-eliminable costs from them, then
> subtract it from or compare to the total (nonparallel) cost. This
> might be more acceptable for everyone than current implement.
>
>
Planning for parallel execution would be a much harder problem to solve.
Just to give a glimpse: how many worker backends can be spawned depends
entirely on the load at the time of execution. For prepared queries, the
load can change between planning and execution, and thus the number of
parallel backends, which would determine the actual execution time and
hence the cost, cannot be estimated at planning time. Mixing that kind of
parallelism with the FDW's parallelism would make things even more
complicated. I think those two problems are to be solved in different
ways.

> > Instead, an FDW which can use parallelism can add two paths one with and
> > one without parallelism with appropriate costs and let the logic choosing
> > the cheapest path take care of the actual choice. In fact, I thought,
> > parallelism would be always faster than the non-parallel one, except when
> > the foreign server is too much loaded. But we won't be able to check that
> > anyway. Can you point out a case where the parallelism may not win over
> > serial execution?
>
> It always wins against serial execution if parallel execution can
> launched with no extra cost. But actually it costs extra resource
> so I thought that parallel execution should be curbed for small
> gain. It's the two GUCs added by this patch and what
> choose_parallel_scans() does, although in non-automated way. The
> overloading issue is not a matter confined to parallel execution
> but surely it will be more severe since it is less visible and
> controllable from users. However, it anyhow would should go to
> manual configuration at end.
>

Without understanding the full impact, I am not sure whether the manual
control this patch provides is really effective. Do we have any numbers
showing the cases in which parallelism would be effective and when it would
not, and how those GUCs help choose the effective option?

>
> > BTW, the name parallelism seems to be misleading here. All, it will be
> able
> > to do is fire the queries (or data fetch requests) asynchronously. So, we
> > might want to change the naming appropriately.
>
> It is right ragarding what I did exactly to postgres_fdw. But not
> allowing all intermedate tuples from child execution nodes in
> parallel to be piled up on memory without restriction, I suppose
> all 'parallel' execution to be a kind of this 'asynchronous'
> startup/fething. As for postgres_fdw, it would look more like
> 'parallel' (and perhaps more effeicient) by processing queries
> using libpq's single-row mode instead of a cursor but the similar
> processing takes place under system calls even for the case.
>
>
By single-row mode, do you mean executing a FETCH for every row? That
wouldn't be efficient, since each row would then incur messaging cost
between the local and foreign server, which cannot be neglected for libpq
at least.

>
> Well, I will try to make the version not including parallel costs
> in plan/path structs, and single-row mode for postgres_fdw. I
> hope it will go towards anything.
>
> regards,
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company