Re: Introducing coarse grain parallelism by postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-01 09:10:55
Message-ID: 20140801.181055.143272972.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, this is the new version which is complete to some extent
of parallelism based on postgres_fdw.

This compares the costs for parallel and non-parallel execution
and choose parallel one if it is faster by some extent specified
by GUCs. The attached files are,

0001_parallel_exec_planning_v0.patch:
- PostgreSQL body stuff for parallel execution planning.

0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
- postgres_fdw parallelization.

0003_file_fdw_changes_to_avoid_error.patch:
- error avoidig stuff for file_fdw (not necessary for this patch)

env.sql:
- simple test script to try this patch.

=====

- planner stuff to handle cost of parallel execution. Including
indication of parallel execution.

- GUCs to control how easy to go parallel.

parallel_cost_threshold is the threshold of path total cost
where to enable parallel execution.

prallel_ratio_threshond is the threshold of the ratio of
parallel cost to non-parallel cost where to choose the
parallel path.

- postgres_fdw which can run in multiple sessions using snapshot
export and fetches in parallel for foreign scans on dedicated
connections.

foreign server has a new option 'max_aux_connections', which
limits the number of connections for parallel execution per
(server, user) pairs.

- change file_fdw to follow the changes of planner stuff.

Whth the patch attached, the attached sql script shows the
following result (after some line breaks are added).

postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
FROM fvs1 a join fvs1_2 b on (a.a = b.a);
QUERY PLAN
----------------------------------------------------------------------------
Hash Join (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
(actual time=2213.400..2213.407 rows=12 loops=1)
Hash Cond: (a.a = b.a)
-> Foreign Scan on fvs1 a
(cost=9573392.96..9573393.29 rows=10 width=8 parallel)
(actual time=2199.992..2199.993 rows=10 loops=1)
-> Hash (cost=9573393.29..9573393.29 rows=10 width=36)
(actual time=13.388..13.388 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 6kB
-> Foreign Scan on fvs1_2 b
(cost=9573392.96..9573393.29 rows=10 width=36 parallel)
(actual time=13.376..13.379 rows=10 loops=1)
Planning time: 4.761 ms
Execution time: 2227.462 ms
(8 rows)
postgres=# SET parallel_ratio_threshold to 0.0;
postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
FROM fvs1 a join fvs1 b on (a.a = b.a);
QUERY PLAN
------------------------------------------------------------------------------
Hash Join (cost=318084.32..318084.69 rows=1 width=40)
(actual time=4302.913..4302.928 rows=12 loops=1)
Hash Cond: (a.a = b.a)
-> Foreign Scan on fvs1 a (cost=159041.93..159042.26 rows=10 width=8)
(actual time=2122.989..2122.992 rows=10 loops=1)
-> Hash (cost=159042.26..159042.26 rows=10 width=500)
(actual time=2179.900..2179.900 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 6kB
-> Foreign Scan on fvs1 b
(cost=159041.93..159042.26 rows=10 width=500)
(actual time=2179.856..2179.864 rows=10 loops=1)
Planning time: 5.085 ms
Execution time: 4303.728 ms
(8 rows)

Where, "parallel" indicates that the node includes nodes run in
parallel. The latter EXPLAIN shows the result when parallel
execution is inhibited.

Since the lack of time, sorry that the details for this patch is
comming later.

Is there any suggestions or opinions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


Attachment Content-Type Size
0001_parallel_exec_planning_v0.patch text/x-patch 30.5 KB
0002_enable_postgres_fdw_to_run_in_parallel_v0.patch text/x-patch 27.5 KB
0003_file_fdw_changes_to_avoid_error.patch text/x-patch 627 bytes
unknown_filename text/plain 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2014-08-01 09:18:37 Re: Introducing coarse grain parallelism by postgres_fdw.
Previous Message Anastasia Lubennikova 2014-08-01 07:58:38 Index-only scans for GIST