Re: Parallel Seq Scan

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-07-05 21:56:53
Message-ID: 1436133413.4369.121.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 2015-07-03 at 17:35 +0530, Amit Kapila wrote:

> Attached, find the rebased version of patch.
>

Comments:

* The heapam.c changes seem a little ad-hoc. Conceptually, which
portions should be affected by parallelism? How do we know we didn't
miss something?
* Why is initscan getting the number of blocks from the structure? Is it
just to avoid an extra syscall, or is there a correctness issue there?
Is initscan expecting that heap_parallelscan_initialize is always called
first (if parallel)? Please add a comment explaining above.
* What's the difference between scan->rs_nblocks and
scan->rs_parallel->phs_nblocks? Same for rs_rd->rd_id and phs_relid.
* It might be good to separate out some fields which differ between the
normal heap scan and the parallel heap scan. Perhaps put rs_ctup,
rs_cblock, and rs_cbuf into a separate structure, which is always NULL
during a parallel scan. That way we don't accidentally use a
non-parallel field when doing a parallel scan.
* Is there a reason that partial scans can't work with syncscan? It
looks like you're not choosing the starting block in the same way, so it
always starts at zero and never does syncscan. If we don't want to mix
syncscan and partial scan, that's fine, but it should be more explicit.

I'm trying to understand where tqueue.c fits in. It seems very closely
tied to the Funnel operator, because any change to the way Funnel works
would almost certainly require changes in tqueue.c. But "tqueue" is a
generic name for the file, so something seems off. Either we should
explicitly make it the supporting routines for the Funnel operator, or
we should try to generalize it a little.

I still have quite a bit to look at, but this is a start.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2015-07-05 22:55:54 Re: raw output from copy
Previous Message Andrew Dunstan 2015-07-05 21:35:52 Re: PostgreSQL 9.5 Alpha 1 build fail with perl 5.22