Re: Parallel Seq Scan

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-11-11 19:46:11
Message-ID: CAFj8pRD9EPnTuRyJ0-iKvZo7FtDidK_NUT0xsWYQtM=w5_x3Ug@mail.gmail.com
Lists: pgsql-hackers

2015-11-11 20:26 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:

> On Wed, Nov 11, 2015 at 12:59 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
> > I have a first query
> >
> > I looked at the EXPLAIN ANALYZE output and the numbers of filtered rows are
> > different
>
> Hmm, I see I was right about people finding more bugs once this was
> committed. That didn't take long.
>

It is a super feature, nobody can wait to try it :). Many more people
can give feedback and run tests now.

>
> There's supposed to be code to handle this - see the
> SharedPlanStateInstrumentation stuff in execParallel.c - but it's
> evidently a few bricks shy of a load.
> ExecParallelReportInstrumentation is supposed to transfer the counts
> from each worker to the DSM:
>
> ps_instrument = &instrumentation->ps_instrument[i];
> SpinLockAcquire(&ps_instrument->mutex);
> InstrAggNode(&ps_instrument->instr, planstate->instrument);
> SpinLockRelease(&ps_instrument->mutex);
>
> And ExecParallelRetrieveInstrumentation is supposed to slurp those
> counts back into the leader's PlanState objects:
>
> /* No need to acquire the spinlock here; workers have exited
> already. */
> ps_instrument = &instrumentation->ps_instrument[i];
> InstrAggNode(planstate->instrument, &ps_instrument->instr);
>
> This might be a race condition, or it might be just wrong logic.
> Could you test what happens if you insert something like a 1-second
> sleep in ExecParallelFinish just after the call to
> WaitForParallelWorkersToFinish()? If that makes the results
> consistent, this is a race. If it doesn't, something else is wrong:
> then it would be useful to know whether the workers are actually
> calling ExecParallelReportInstrumentation, and whether the leader is
> actually calling ExecParallelRetrieveInstrumentation, and if so
> whether they are doing it for the correct set of nodes.
>
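
To make the report/retrieve flow quoted above concrete, here is a minimal
single-threaded sketch. ToyInstr and the toy_* functions are hypothetical
stand-ins invented for illustration; they are not the real Instrumentation
struct or the code in execParallel.c, and the spinlock and the DSM segment
are left out. It only shows the intended arithmetic: each worker adds its
counters into a shared slot, and the leader adds that slot into its own
counters once the workers are done.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for PostgreSQL's Instrumentation. */
typedef struct ToyInstr
{
    double ntuples;     /* rows emitted by the node */
    double nfiltered;   /* rows removed by the filter */
} ToyInstr;

/* Add the counters in "add" to "dst", in the spirit of InstrAggNode(). */
void
toy_instr_agg(ToyInstr *dst, const ToyInstr *add)
{
    dst->ntuples += add->ntuples;
    dst->nfiltered += add->nfiltered;
}

/*
 * Worker side: fold the worker's local counters into the shared slot.
 * The real code takes a spinlock around this step; this sketch is
 * single-threaded, so no locking is modelled.
 */
void
toy_report(ToyInstr *shared_slot, const ToyInstr *worker_local)
{
    toy_instr_agg(shared_slot, worker_local);
}

/* Leader side: fold the shared slot into the leader's own counters. */
void
toy_retrieve(ToyInstr *leader, const ToyInstr *shared_slot)
{
    toy_instr_agg(leader, shared_slot);
}

/* Run the whole flow for one plan node and return the combined counters. */
ToyInstr
toy_total(const ToyInstr *leader_local, const ToyInstr *workers, int nworkers)
{
    ToyInstr shared = {0, 0};
    ToyInstr result;
    int      i;

    assert(nworkers >= 0);
    for (i = 0; i < nworkers; i++)
        toy_report(&shared, &workers[i]);

    result = *leader_local;
    toy_retrieve(&result, &shared);
    return result;
}
```

If every row is counted exactly once by exactly one process, the combined
ntuples + nfiltered for a scan node should equal the number of rows in the
table, no matter how the work was split between leader and workers.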

I added pg_usleep(1000000L) there, but it did not help:

postgres=# EXPLAIN ANALYZE select count(*) from xxx where a % 10 = 0;
                                                          QUERY PLAN
═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
 Aggregate  (cost=9282.50..9282.51 rows=1 width=0) (actual time=154.535..154.535 rows=1 loops=1)
   ->  Gather  (cost=1000.00..9270.00 rows=5000 width=0) (actual time=0.675..142.320 rows=100000 loops=1)
         Number of Workers: 2
         ->  Parallel Seq Scan on xxx  (cost=0.00..7770.00 rows=5000 width=0) (actual time=0.075..445.999 rows=168927 loops=1)
               Filter: ((a % 10) = 0)
               Rows Removed by Filter: 1520549
 Planning time: 0.117 ms
 Execution time: 1155.505 ms
(8 rows)

Expected (the plain Seq Scan plan):

postgres=# EXPLAIN ANALYZE select count(*) from xxx where a % 10 = 0;
                                                   QUERY PLAN
═══════════════════════════════════════════════════════════════════════════════════════════════════════════════
 Aggregate  (cost=19437.50..19437.51 rows=1 width=0) (actual time=171.233..171.233 rows=1 loops=1)
   ->  Seq Scan on xxx  (cost=0.00..19425.00 rows=5000 width=0) (actual time=0.187..162.627 rows=100000 loops=1)
         Filter: ((a % 10) = 0)
         Rows Removed by Filter: 900000
 Planning time: 0.119 ms
 Execution time: 171.322 ms
(6 rows)

The test is based on the table xxx:

create table xxx(a int);
insert into xxx select generate_series(1,1000000);
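
As a quick sanity check on the two plans above, here is a small sketch of
the arithmetic. The function names are made up for illustration and have
nothing to do with PostgreSQL's own code. The idea is that, for a scan with
a filter, rows returned plus rows removed should add up to the number of
rows in the table, and that generate_series(1, n) contains n / 10 multiples
of 10.

```c
#include <assert.h>
#include <stdio.h>

/* Rows a scan node has accounted for: returned plus filtered out. */
long
rows_examined(long rows_returned, long rows_removed)
{
    assert(rows_returned >= 0 && rows_removed >= 0);
    return rows_returned + rows_removed;
}

/* Number of values in 1..n that satisfy a % 10 = 0. */
long
expected_matches(long n)
{
    return n / 10;
}

/* Print whether a scan's counters add up to the table size. */
void
print_scan_check(const char *label, long table_rows,
                 long rows_returned, long rows_removed)
{
    long seen = rows_examined(rows_returned, rows_removed);

    printf("%s: examined %ld of %ld rows (%s)\n",
           label, seen, table_rows,
           seen == table_rows ? "consistent" : "inconsistent");
}
```

With the numbers from the plans: the plain Seq Scan accounts for
100000 + 900000 = 1000000 rows, which matches the table. The Parallel Seq
Scan node reports 168927 + 1520549 = 1689476 rows, which is more than the
table contains, although the Gather node above it still returns the
correct 100000 rows.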

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
