Re: asynchronous and vectorized execution

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous and vectorized execution
Date: 2016-05-11 15:49:39
Message-ID: 20160511154939.hs2ipvcfhxxfqccj@alap3.anarazel.de
Lists: pgsql-hackers

On 2016-05-11 10:12:26 -0400, Robert Haas wrote:
> > I have to admit I'm not that convinced about the speedups in the !fdw
> > case. There seem to be a lot of easier avenues for performance
> > improvements.
>
> What I'm talking about is a query like this:
>
> SELECT * FROM inheritance_tree_of_foreign_tables WHERE very_rarely_true;

Note that I said "!fdw case".

> > FWIW, I've even hacked something up for a bunch of simple queries, and
> > the performance improvements were significant. Besides it only being a
> > weekend hack project, the big thing I got stuck on was considering how
> > to exactly determine when to batch and not to batch.
>
> Yeah. I think we need a system for signalling nodes as to when they
> will be run to completion. But a Boolean is somehow unsatisfying;
> LIMIT 1000000 is more like no LIMIT than it is like LIMIT 1. I'm
> tempted to add a numTuples field to every ExecutorState and give upper
> nodes some way to set it, as a hint.

I was wondering whether we should hand down TupleVectorStates to lower
nodes, and their size determines the max batch size...
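
To make the idea of handing vectors down concrete, here is a minimal
sketch, not code from any patch. The fields of TupleVectorState, as well
as ToyScan and toy_scan_next_batch(), are hypothetical, and plain ints
stand in for tuples. The parent allocates the vector, so its capacity is
what caps the child's batch size.

```c
#include <assert.h>

/* Hypothetical batch container; the field layout is an assumption. */
typedef struct TupleVectorState
{
    int   capacity;   /* max tuples per batch, chosen by the parent node */
    int   ntuples;    /* tuples currently stored in values[] */
    int  *values;     /* stand-in for real tuple slots */
} TupleVectorState;

/* Toy scan over an int array standing in for a relation. */
typedef struct ToyScan
{
    const int *rows;
    int        nrows;
    int        next;   /* index of the next row to return */
} ToyScan;

/*
 * Fill the vector handed down by the parent. The child never returns
 * more than vec->capacity tuples, so the parent controls batching.
 * Returns the number of tuples stored; 0 means the scan is exhausted.
 */
static int
toy_scan_next_batch(ToyScan *scan, TupleVectorState *vec)
{
    assert(vec->capacity > 0);

    vec->ntuples = 0;
    while (vec->ntuples < vec->capacity && scan->next < scan->nrows)
        vec->values[vec->ntuples++] = scan->rows[scan->next++];

    return vec->ntuples;
}
```

With something like this, a parent under LIMIT 1 could hand down a
vector of capacity 1, while a node that will be run to completion could
hand down a much larger one.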

> >> Some care is required here because any
> >> functions we execute as scan keys are run with the buffer locked, so
> >> we had better not run anything very complicated. But doing this for
> >> simple things like integer equality operators seems like it could save
> >> quite a few buffer lock/unlock cycles and some other executor overhead
> >> as well.
> >
> > Hm. Do we really have to keep the page locked in the page-at-a-time
> > mode? Shouldn't the pin suffice?
>
> I think we need a lock to examine MVCC visibility information. A pin
> is enough to prevent a tuple from being removed, but not from having
> its xmax and cmax overwritten at almost but not quite exactly the same
> time.

We already batch visibility lookups in page-at-a-time
mode. Cf. heapgetpage() / scan->rs_vistuples. So we can evaluate quals
after releasing the lock, but before the pin is released, without that
much effort. IIRC that batching isn't used for index lookups, but doing
it there as well is probably a good idea.
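
The ordering described above can be sketched with a toy model. This is
not the heapgetpage() code; ToyPage, ToyScanDesc and toy_scan_page() are
hypothetical, the lock and pin are plain fields, and only the vistuples
array mirrors the role of scan->rs_vistuples. Visibility is checked
under the lock, the lock is dropped, and the qual runs while only the
pin is held.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TUPLES 8

typedef struct ToyTuple
{
    bool visible;   /* stand-in for the result of an MVCC check */
    int  value;
} ToyTuple;

typedef struct ToyPage
{
    int      pin_count;   /* stand-in for a buffer pin */
    bool     locked;      /* stand-in for the buffer content lock */
    int      ntuples;
    ToyTuple tuples[TOY_MAX_TUPLES];
} ToyPage;

typedef struct ToyScanDesc
{
    int nvis;                        /* number of visible tuples */
    int vistuples[TOY_MAX_TUPLES];   /* their offsets on the page */
    int quals_under_lock;            /* counts quals run with lock held */
} ToyScanDesc;

/* Simple equality qual; records whether it ran while the page was locked. */
static bool
toy_qual(const ToyPage *page, ToyScanDesc *scan, int value, int key)
{
    if (page->locked)
        scan->quals_under_lock++;
    return value == key;
}

/* Returns the number of matching tuples copied into out[]. */
static int
toy_scan_page(ToyPage *page, ToyScanDesc *scan, int key, int *out)
{
    int nmatch = 0;

    page->pin_count++;      /* pin the page */
    page->locked = true;    /* lock it to examine visibility */

    /* Batch all visibility checks while the lock is held. */
    scan->nvis = 0;
    for (int i = 0; i < page->ntuples; i++)
        if (page->tuples[i].visible)
            scan->vistuples[scan->nvis++] = i;

    page->locked = false;   /* release the lock, keep the pin */

    /* Evaluate quals on the visible tuples with only the pin held. */
    for (int i = 0; i < scan->nvis; i++)
    {
        const ToyTuple *tup = &page->tuples[scan->vistuples[i]];

        if (toy_qual(page, scan, tup->value, key))
            out[nmatch++] = tup->value;
    }

    page->pin_count--;      /* finally drop the pin */
    return nmatch;
}
```

In this model quals_under_lock stays at zero, which is the point: the
qual never runs with the page locked, however expensive it is.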

Greetings,

Andres Freund
