Re: Introducing coarse grain parallelism by postgres_fdw.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Introducing coarse grain parallelism by postgres_fdw.
Date: 2014-08-01 09:18:37
Message-ID: 20140801.181837.27561179.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> Hello, this is the new version which is complete to some extent
> of parallelism based on postgres_fdw.
>
> This compares the costs for parallel and non-parallel execution
> and choose parallel one if it is faster by some extent specified
> by GUCs. The attached files are,
>
> 0001_parallel_exec_planning_v0.patch:
> - PostgreSQL body stuff for parallel execution planning.
>
> 0002_enable_postgres_fdw_to_run_in_parallel_v0.patch:
> - postgres_fdw parallelization.
>
> 0003_file_fdw_changes_to_avoid_error.patch:
> - error avoidig stuff for file_fdw (not necessary for this patch)
>
> env.sql:
> - simple test script to try this patch.
>
> =====
>
> - planner stuff to handle cost of parallel execution. Including
> indication of parallel execution.
>
> - GUCs to control how easy to go parallel.
>
> parallel_cost_threshold is the threshold of path total cost
> where to enable parallel execution.
>
> prallel_ratio_threshond is the threshold of the ratio of
> parallel cost to non-parallel cost where to choose the
> parallel path.
>
> - postgres_fdw which can run in multiple sessions using snapshot
> export and fetches in parallel for foreign scans on dedicated
> connections.

But now the effect of async execution of FETCH'es is omitted
during planning.

> foreign server has a new option 'max_aux_connections', which
> limits the number of connections for parallel execution per
> (server, user) pairs.
>
> - change file_fdw to follow the changes of planner stuff.
>
>
> Whth the patch attached, the attached sql script shows the
> following result (after some line breaks are added).
>
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1_2 b on (a.a = b.a);
> QUERY PLAN
> ----------------------------------------------------------------------------
> Hash Join (cost=9573392.96..9573393.34 rows=1 width=40 parallel)
> (actual time=2213.400..2213.407 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a
> (cost=9573392.96..9573393.29 rows=10 width=8 parallel)
> (actual time=2199.992..2199.993 rows=10 loops=1)
> -> Hash (cost=9573393.29..9573393.29 rows=10 width=36)
> (actual time=13.388..13.388 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1_2 b
> (cost=9573392.96..9573393.29 rows=10 width=36 parallel)
> (actual time=13.376..13.379 rows=10 loops=1)
> Planning time: 4.761 ms
> Execution time: 2227.462 ms
> (8 rows)
> postgres=# SET parallel_ratio_threshold to 0.0;
> postgres=# EXPLAIN ANALYZE SELECT a.a, a.b, b.c
> FROM fvs1 a join fvs1 b on (a.a = b.a);
> QUERY PLAN
> ------------------------------------------------------------------------------
> Hash Join (cost=318084.32..318084.69 rows=1 width=40)
> (actual time=4302.913..4302.928 rows=12 loops=1)
> Hash Cond: (a.a = b.a)
> -> Foreign Scan on fvs1 a (cost=159041.93..159042.26 rows=10 width=8)
> (actual time=2122.989..2122.992 rows=10 loops=1)
> -> Hash (cost=159042.26..159042.26 rows=10 width=500)
> (actual time=2179.900..2179.900 rows=10 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 6kB
> -> Foreign Scan on fvs1 b
> (cost=159041.93..159042.26 rows=10 width=500)
> (actual time=2179.856..2179.864 rows=10 loops=1)
> Planning time: 5.085 ms
> Execution time: 4303.728 ms
> (8 rows)
>
> Where, "parallel" indicates that the node includes nodes run in
> parallel. The latter EXPLAIN shows the result when parallel
> execution is inhibited.
>
> Since the lack of time, sorry that the details for this patch is
> comming later.
>
> Is there any suggestions or opinions?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2014-08-01 09:28:26 Re: SKIP LOCKED DATA (work in progress)
Previous Message Kyotaro HORIGUCHI 2014-08-01 09:10:55 Re: Introducing coarse grain parallelism by postgres_fdw.