Re: asynchronous execution

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, ah(at)cybertec(dot)at, pgsql-hackers(at)postgresql(dot)org
Subject: Re: asynchronous execution
Date: 2017-07-31 09:42:45
Message-ID: 20170731.184245.06849084.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 28 Jul 2017 17:31:05 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20170728(dot)173105(dot)238045591(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> Thank you for the comment.
>
> At Wed, 26 Jul 2017 17:16:43 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmoYrbgTBnLwnr1v=pk+C=znWg7AgV9=M9ehrq6TDexPQNw(at)mail(dot)gmail(dot)com>
> > regression all the same. Every type of intermediate node will have to
> > have a code path where it uses ExecAsyncRequest() /
> > ExecAyncHogeResponse() rather than ExecProcNode() to get tuples, and
>
> I understand what Robert concerns and I share the same
> opinion. It needs further different framework.
>
> At Thu, 27 Jul 2017 06:39:51 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+Tgmoa=ke_zfucOAa3YEUnBSC=FSXn8SU2aYc8PGBBp=Yy9fw(at)mail(dot)gmail(dot)com>
> > On Wed, Jul 26, 2017 at 5:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > The scheme he has allows $extra_stuff to be injected into ExecProcNode at
> > > no cost when $extra_stuff is not needed, because you simply don't insert
> > > the wrapper function when it's not needed. I'm not sure that it will
...
> > Yeah, I don't quite see how that would apply in this case -- what we
> > need here is not as simple as just conditionally injecting an extra
> > bit.
>
> Thank you for the pointer, Tom. The subject (segfault in HEAD...)
> haven't made me think that this kind of discussion was held
> there. Anyway it seems very closer to asynchronous execution so
> I'll catch up it considering how I can associate with this.

I understand the executor change which has just been made at
master based on the pointed thread. This seems to have the
capability to let exec-node switch to async-aware with no extra
cost on non-async processing. So it would be doable to (just)
*shrink* the current framework by detaching the async-aware side
of the API. But to get the most out of asynchrony, it is required
that multiple async-capable nodes distributed under async-unaware
nodes run simultaneously.

There seems two ways to achieve this.

One is propagating required-async-nodes bitmap up to the topmost
node and waiting for the all required nodes to become ready. In
the long run this requires all nodes to be async-aware and that
apparently would have bad effect on performance to async-unaware
nodes containing async-capable nodes.

Another is getting rid of recursive call to run an execution
tree. It is perhaps the same to what mentioned as "data-centric
processing" in a previous threads [1], [2], but I'd like to I pay
attention on the aspect of "enableing to resume execution tree
from arbitrary leaf node". So I'm considering to realize it
still in one-tuple-by-one manner instead of collecting all tuples
of a leaf node first. Even though I'm not sure it is doable.

[1] https://www.postgresql.org/message-id/BF2827DCCE55594C8D7A8F7FFD3AB77159A9B904@szxeml521-mbs.china.huawei.com
[2] https://www.postgresql.org/message-id/20160629183254.frcm3dgg54ud5m6o@alap3.anarazel.de

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-07-31 09:56:49 Re: map_partition_varattnos() and whole-row vars
Previous Message Remi Colinet 2017-07-31 09:20:13 Re: [PATCH v3] pg_progress() SQL function to monitor progression of long running SQL queries/utilities