Re: Changed SRF in targetlist handling

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changed SRF in targetlist handling
Date: 2016-06-06 16:30:31
Message-ID: CAKFQuwbs-hUru-cifwNJ18cKrLbriSrXM9kWm=ZbAcya8jDgug@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Mon, May 23, 2016 at 4:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...). This would
> >> have the same behavior as before if the SRFs all return the same number
> >> of rows, and otherwise would behave differently.
>
> > I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> > LATERAL ROWS FROM (srf2()), ...
>
> No, because then you get the cross-product of multiple SRFs, not the
> run-in-lockstep behavior.
>
> > The rewrite you propose here seems to NULL-pad rows after the first
> > SRF is exhausted:
>
> Yes. That's why I said it's not compatible if the SRFs don't all return
> the same number of rows. It seems like a reasonable definition to me
> though, certainly much more reasonable than the current run-until-LCM
> behavior.
>

IOW, this is why this mode of query has to fail.

>
> > The latter is how I'd expect SRF-in-targetlist to work.
>
> That's not even close to how it works now. It would break *every*
> existing application that has multiple SRFs in the tlist, not just
> the ones whose SRFs return different numbers of rows. And I'm not
> convinced that it's a more useful behavior.
>

To clarify, the present behavior is basically a combination of both of
Robert's results.

If the SRFs return the same number of rows, the first (zippered) result is
returned without any NULL padding.

If the SRFs return a different number of rows the LCM behavior kicks in and
you get Robert's second result.

SELECT generate_series(1, 4), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM ( generate_series(1, 4), generate_series(1, 4) );

BUT

SELECT generate_series(1, 3), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM ( generate_series(1, 3) ) a, LATERAL ROWS FROM
( generate_series(1, 4) ) b;
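
For anyone who wants to poke at the three behaviors side by side, here is a rough Python model. It is only a sketch, not PostgreSQL code: the function names (run_until_lcm, rows_from, cross_product, series) are made up for illustration, and it assumes each SRF simply restarts from its first row once exhausted, which is how the run-until-LCM behavior is described in this thread.

```python
from itertools import cycle, islice, product, zip_longest
from math import lcm  # requires Python 3.9+


def series(lo, hi):
    """Stand-in for generate_series(lo, hi): inclusive integer range."""
    return range(lo, hi + 1)


def run_until_lcm(*srfs):
    """Model of the current targetlist behavior: every SRF restarts when it
    runs out, and output stops once all of them finish at the same time,
    i.e. after LCM(row counts) rows."""
    rows = [list(s) for s in srfs]
    n = lcm(*(len(r) for r in rows))
    return list(islice(zip(*(cycle(r) for r in rows)), n))


def rows_from(*srfs):
    """Model of ROWS FROM (srf1(), srf2(), ...): lockstep, with None (NULL)
    padding once a shorter SRF is exhausted."""
    return list(zip_longest(*srfs, fillvalue=None))


def cross_product(*srfs):
    """Model of ROWS FROM (srf1()), LATERAL ROWS FROM (srf2()), ..."""
    return list(product(*srfs))


# Same row counts: current behavior matches the zippered ROWS FROM result.
same = run_until_lcm(series(1, 4), series(1, 4))
print(same == rows_from(series(1, 4), series(1, 4)))

# 3 rows vs 4 rows: current behavior yields 12 rows, the cross product.
mixed = run_until_lcm(series(1, 3), series(1, 4))
print(sorted(mixed) == sorted(cross_product(series(1, 3), series(1, 4))))
```

Note that in this model the match with the cross product depends on the row counts sharing no common factor, as 3 and 4 do. With 2 and 4 rows the LCM is 4, so run_until_lcm returns 4 rows while the cross product has 8.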

Tom's 2.5 proposal basically says we make the former equivalence succeed
and have the latter one fail.

The rewrite would be unaware of the cardinality of the SRF and so it cannot
conditionally rewrite the query. One of the two must be chosen and the
incompatible behavior turned into an error.
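
A small Python sketch of what "choose lockstep and error out otherwise" could look like at execution time. This is purely an illustration under my own assumptions: the thread does not say where or how the error would be raised, and lockstep_or_error is a hypothetical name, not anything in the backend.

```python
def lockstep_or_error(*srfs):
    """Hypothetical model: run the SRFs in lockstep and raise as soon as one
    is exhausted while another still has rows. The row counts are only
    discovered while iterating, never known up front."""
    iters = [iter(s) for s in srfs]
    exhausted = object()  # sentinel marking an SRF with no more rows
    rows = []
    while True:
        vals = [next(it, exhausted) for it in iters]
        done = [v is exhausted for v in vals]
        if all(done):
            return rows
        if any(done):
            raise ValueError(
                "set-returning functions returned different numbers of rows"
            )
        rows.append(tuple(vals))


# Equal row counts succeed with the zippered result.
print(lockstep_or_error(range(1, 5), range(1, 5)))

# Unequal row counts fail instead of falling back to LCM or NULL padding.
try:
    lockstep_or_error(range(1, 4), range(1, 5))
except ValueError as exc:
    print("error:", exc)
```

The point of the sketch is that the mismatch is only detectable while rows are being produced, which is why a rewrite that cannot see the cardinality has to pick one behavior ahead of time.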

David J.
