Re: Parallel Seq Scan

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-09-18 03:44:13
Message-ID: CAA4eK1L8chMx2UXiUb80Fuq00VuSjR-EUHV4Y3LRm3_sGzD8cQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 17, 2015 at 6:58 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Sep 3, 2015 at 6:21 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > [ new patches ]
>
> + pscan = shm_toc_lookup(node->ss.ps.toc, PARALLEL_KEY_SCAN);
>
> This is total nonsense. You can't hard-code the key that's used for
> the scan, because we need to be able to support more than one parallel
> operator beneath the same funnel. For example:
>
> Append
>   -> Partial Seq Scan
>   -> Partial Seq Scan
>

Okay, but I think the same can be achieved with the current approach as
well. The basic idea is that each worker works on one planned statement
at a time. In the above case there will be two different planned
statements, and they will store their partial seq scan information in
two different locations in the toc, even though the key
(PARALLEL_KEY_SCAN) is the same. I think this is quite similar to what
we are already doing for the response queues. A worker will use one of
those locations based on the planned statement it chooses to execute.
I have explained this in somewhat more detail in one of my previous
mails [1].
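
To make the idea concrete, here is a minimal sketch in plain C. It is
not the patch code and does not use the real shm_toc API: MiniToc,
mini_toc_insert, mini_toc_lookup, ScanSlot and scan_slot_for_statement
are hypothetical names, and the layout (one key pointing at an array
with one slot per planned statement) is only my assumption about how
"same key, different locations" could look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARALLEL_KEY_SCAN 1u    /* value chosen arbitrarily for this sketch */
#define MINI_TOC_MAX      8

/* Hypothetical per-statement scan state; one slot per planned statement. */
typedef struct ScanSlot
{
    int      stmt_index;        /* which planned statement owns this slot */
    uint32_t next_block;        /* next block to hand out for this scan */
} ScanSlot;

/* Toy stand-in for a table of contents: a small array of key/address pairs. */
typedef struct MiniTocEntry
{
    uint64_t key;
    void    *addr;
} MiniTocEntry;

typedef struct MiniToc
{
    int          nentries;
    MiniTocEntry entries[MINI_TOC_MAX];
} MiniToc;

/* Register addr under key; returns 0 on success, -1 if the table is full. */
int
mini_toc_insert(MiniToc *toc, uint64_t key, void *addr)
{
    assert(toc != NULL);
    if (toc->nentries >= MINI_TOC_MAX)
        return -1;
    toc->entries[toc->nentries].key = key;
    toc->entries[toc->nentries].addr = addr;
    toc->nentries++;
    return 0;
}

/* Return the address registered under key, or NULL if there is none. */
void *
mini_toc_lookup(const MiniToc *toc, uint64_t key)
{
    for (int i = 0; i < toc->nentries; i++)
    {
        if (toc->entries[i].key == key)
            return toc->entries[i].addr;
    }
    return NULL;
}

/*
 * Every worker looks up the same key, then indexes into the array by the
 * planned statement it chose to execute. Returns NULL for a bad index.
 */
ScanSlot *
scan_slot_for_statement(const MiniToc *toc, int stmt_index, int nstmts)
{
    ScanSlot *slots = mini_toc_lookup(toc, PARALLEL_KEY_SCAN);

    if (slots == NULL || stmt_index < 0 || stmt_index >= nstmts)
        return NULL;
    return &slots[stmt_index];
}
```

With two planned statements under the Append, a worker that picks
statement 0 and a worker that picks statement 1 both look up
PARALLEL_KEY_SCAN but end up at different slots.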

> Each partial sequential scan needs to have a *separate* key, which
> will need to be stored in either the Plan or the PlanState or both
> (not sure exactly). Each partial seq scan needs to get assigned a
> unique key there in the master, probably starting from 0 or 100 or
> something and counting up, and then this code needs to extract that
> value and use it to look up the correct data for that scan.
>

Even in that case, multiple workers can work on the same key, assuming
that in your example above multiple workers are required to execute
each partial seq scan. We might then need to see how to map the
instrumentation information to a particular execution.
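
A rough sketch of the per-scan key scheme, again in plain C and not
taken from any patch: the master hands out keys counting up from a
base, and any number of workers can resolve the same key to the same
shared state. ScanRegistry, SharedScan and the functions below are
hypothetical names, the base of 100 is just one of the values suggested
above, and the sketch is single-threaded, so a real version would need
proper synchronization around next_block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCAN_KEY_BASE 100       /* assumed starting key; 0 is never handed out */
#define MAX_SCANS     4

/* Hypothetical shared state for one partial seq scan. */
typedef struct SharedScan
{
    uint64_t key;               /* unique key assigned by the master */
    uint32_t nblocks;           /* total blocks in the relation */
    uint32_t next_block;        /* next block not yet claimed by a worker */
} SharedScan;

typedef struct ScanRegistry
{
    uint64_t   next_key;
    int        nscans;
    SharedScan scans[MAX_SCANS];
} ScanRegistry;

void
registry_init(ScanRegistry *reg)
{
    reg->next_key = SCAN_KEY_BASE;
    reg->nscans = 0;
}

/* Master side: register a scan and return its key, or 0 if the registry is full. */
uint64_t
registry_add_scan(ScanRegistry *reg, uint32_t nblocks)
{
    SharedScan *scan;

    assert(reg != NULL);
    if (reg->nscans >= MAX_SCANS)
        return 0;
    scan = &reg->scans[reg->nscans++];
    scan->key = reg->next_key++;
    scan->nblocks = nblocks;
    scan->next_block = 0;
    return scan->key;
}

/* Worker side: map a key back to its shared scan state, or NULL if unknown. */
SharedScan *
registry_lookup(ScanRegistry *reg, uint64_t key)
{
    for (int i = 0; i < reg->nscans; i++)
    {
        if (reg->scans[i].key == key)
            return &reg->scans[i];
    }
    return NULL;
}

/* Worker side: claim the next unclaimed block, or -1 when the scan is done. */
int64_t
scan_claim_block(SharedScan *scan)
{
    if (scan->next_block >= scan->nblocks)
        return -1;
    return scan->next_block++;
}
```

Two workers that look up the same key get the same SharedScan and
split its blocks between them. Nothing in this state records which
worker did what, which is the instrumentation mapping question.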

[1] http://www.postgresql.org/message-id/CAA4eK1LNt6wQBCxKsMj_QC+GahBuwyKWsQn6UL3nWVQ2savzwg@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
