Re: Changed SRF in targetlist handling

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changed SRF in targetlist handling
Date: 2016-06-06 18:53:41
Message-ID: 29253.1465239221@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Robert Haas wrote:
>> On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> No, because then you get the cross-product of multiple SRFs, not the
>>> run-in-lockstep behavior.

>> Oh. I assumed that was the expected behavior. But, ah, what do I know?

> Lots, I assume -- but in this case, probably next to nothing, just like
> most of us, because what sane person or application would be really
> relying on the wacko historical behavior, in order to generate some
> collective knowledge? However, I think that it is possible that
> someone, somewhere has two SRFs-in-targetlist that return the same
> number of rows and that the current implementation works fine for them;

Yes. Run-in-lockstep is an extremely useful behavior, so much so that
we made a LATERAL variant for it. I do not see a reason to break such
cases in the targetlist.

> My vote is to raise an error in the case of more than one SRF in targetlist.

Note that that risks breaking cases that the user does not think are "more
than one SRF". Consider this example using a regression-test table:

regression=# create function foo() returns setof int8_tbl as
regression-# 'select * from int8_tbl' language sql;
CREATE FUNCTION
regression=# select foo();
foo
--------------------------------------
(123,456)
(123,4567890123456789)
(4567890123456789,123)
(4567890123456789,4567890123456789)
(4567890123456789,-4567890123456789)
(5 rows)

regression=# explain verbose select foo();
QUERY PLAN
----------------------------------------------
Result (cost=0.00..5.25 rows=1000 width=32)
Output: foo()
(2 rows)

regression=# select (foo()).*;
q1 | q2
------------------+-------------------
123 | 456
123 | 4567890123456789
4567890123456789 | 123
4567890123456789 | 4567890123456789
4567890123456789 | -4567890123456789
(5 rows)

regression=# explain verbose select (foo()).*;
QUERY PLAN
----------------------------------------------
Result (cost=0.00..5.50 rows=1000 width=16)
Output: (foo()).q1, (foo()).q2
(2 rows)

The reason we can get away with this simplistic treatment of
composite-returning SRFs is precisely the run-in-lockstep behavior.
Otherwise the second query would have returned 25 rows.

Now, if we decide to try to rewrite tlist SRFs as LATERAL, it would likely
behoove us to do that rewrite before expanding * not after, so that we can
eliminate the multiple evaluation of foo() that happens currently. (That
makes it a parser problem not a planner problem.) And maybe we should
rewrite non-SRF composite-returning functions this way too, because people
have definitely complained about the extra evaluations in that context.
But my point here is that lockstep evaluation does have practical use
when the SRFs are iterating over matching collections of generated rows.
And that seems like a pretty common use-case.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-06-06 19:06:37 Re: Changed SRF in targetlist handling
Previous Message Robert Haas 2016-06-06 18:50:03 Re: [sqlsmith] Failed assertions on parallel worker shutdown