LATERAL, UNNEST and spec compliance

Lists: pgsql-hackers
From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: LATERAL, UNNEST and spec compliance
Date: 2013-01-24 17:51:46
Message-ID: 20130124175146.GB5766@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Folks,

Andrew Gierth asked me to send this out as his email is in a parlous
state at the moment. My comments will follow in replies. Without
further ado:

SQL2008 says, for 7.6 <table reference>

6)
a) If TR is contained in a <from clause> FC with no intervening <query
expression>, then the scope clause SC of TR is the <select statement:
single row> or innermost <query specification> that contains FC. The
scope of a range variable of TR is the <select list>, <where clause>,
<group by clause>, <having clause>, and <window clause> of SC, together
with every <lateral derived table> that is simply contained in FC and
is preceded by TR, and every <collection derived table> that is simply
contained in FC and is preceded by TR, and the <join condition> of all
<joined table>s contained in SC that contain TR. If SC is the <query
specification> that is the <query expression body> of a simple table
query STQ, then the scope of a range variable of TR also includes the
<order by clause> of STQ.

This is the clause that defines the scope effect of LATERAL, and as can be
seen, it defines <collection derived table>, i.e. UNNEST(), as having the
same behaviour as <lateral derived table>.

It is also worth noting at this point that pg's "FROM func()" syntax is not
in the spec (the nearest is "FROM TABLE(<collection value expression>)").

Our implementation of UNNEST currently deviates from the spec by not being
implicitly LATERAL; given the (sub)query

SELECT * FROM sometable, UNNEST(somearray);

then "somearray" is required to be a parameter or outer reference rather
than a column of "sometable". To get the spec's behaviour for this, we
currently have to do:

SELECT * FROM sometable, LATERAL UNNEST(somearray);

which is non-standard syntax. (In the spec, only <table subquery> can
follow LATERAL.)

(We also don't accept the (optional) syntax of S301, allowing multiple
parameters to UNNEST().)

As I see it, the current options are:

1. Do nothing, and insist on non-standard use of the LATERAL keyword.

2. Add UNNEST to the grammar (or parse analysis) as a special case, making
it implicitly LATERAL.

(This would make implementing S301 easier, but special cases are ugly.)

3. Make all cases of SRFs in the FROM-clause implicitly LATERAL.

(As far as I can tell, those cases whose behaviour would be changed by
this actually produce errors in versions prior to 9.3, so no working
code should be affected.)

Since LATERAL is new in 9.3, I think the pros and cons of these choices
should be considered now, rather than being allowed to slide by unexamined.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 05:12:41
Message-ID: 20130125051241.GC31639@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 24, 2013 at 09:51:46AM -0800, David Fetter wrote:
> Folks,
>
> Andrew Gierth asked me to send this out as his email is in a parlous
> state at the moment. My comments will follow in replies. Without
> further ado:
> [snip]
>
> As I see it, the current options are:
>
> 1. Do nothing, and insist on non-standard use of the LATERAL keyword.
>
> 2. Add UNNEST to the grammar (or parse analysis) as a special case, making
> it implicitly LATERAL.
>
> (This would make implementing S301 easier, but special cases are ugly.)
>
> 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL.
>
> (As far as I can tell, those cases whose behaviour would be changed by
> this actually produce errors in versions prior to 9.3, so no working
> code should be affected.)
>
> Since LATERAL is new in 9.3, I think the pros and cons of these choices
> should be considered now, rather than being allowed to slide by unexamined.

Please find attached a patch which implements approach 3. The vast
majority of it is changes to the regression tests. The removed
regression tests in join.{sql,out} are no longer errors, although some
of them are pretty standard DoS attacks, hence they're all removed.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
lateral_srf_in_from_001.patch text/plain 71.0 KB

From: David Fetter <david(at)fetter(dot)org>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 17:25:17
Message-ID: 20130125172517.GE31639@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 24, 2013 at 09:12:41PM -0800, David Fetter wrote:
> On Thu, Jan 24, 2013 at 09:51:46AM -0800, David Fetter wrote:
> > Folks,
> >
> > Andrew Gierth asked me to send this out as his email is in a parlous
> > state at the moment. My comments will follow in replies. Without
> > further ado:
> > [snip]
> >
> > As I see it, the current options are:
> >
> > 1. Do nothing, and insist on non-standard use of the LATERAL keyword.
> >
> > 2. Add UNNEST to the grammar (or parse analysis) as a special case, making
> > it implicitly LATERAL.
> >
> > (This would make implementing S301 easier, but special cases are ugly.)
> >
> > 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL.
> >
> > (As far as I can tell, those cases whose behaviour would be changed by
> > this actually produce errors in versions prior to 9.3, so no working
> > code should be affected.)
> >
> > Since LATERAL is new in 9.3, I think the pros and cons of these choices
> > should be considered now, rather than being allowed to slide by unexamined.
>
> Please find attached a patch which implements approach 3. The vast
> majority of it is changes to the regression tests. The removed
> regression tests in join.{sql,out} are no longer errors, although some
> of them are pretty standard DoS attacks, hence they're all removed.
>
> Cheers,
> David.

Oops. Misspelled rtekind in the previous patch. Here's a corrected
one, much shorter.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
lateral_srf_in_from_002.patch text/plain 5.1 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 18:14:24
Message-ID: 20130125181424.GS16126@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David Fetter (david(at)fetter(dot)org) wrote:
> As I see it, the current options are:
>
> 1. Do nothing, and insist on non-standard use of the LATERAL keyword.

I'm not a big fan of this. Providing a good error message saying "you
need to use LATERAL for this query to work" makes it slightly better,
but I don't feel like there's really any ambiguity here.

> 2. Add UNNEST to the grammar (or parse analysis) as a special case, making
> it implicitly LATERAL.
>
> (This would make implementing S301 easier, but special cases are ugly.)

This I really don't like.

> 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL.
>
> (As far as I can tell, those cases whose behaviour would be changed by
> this actually produce errors in versions prior to 9.3, so no working
> code should be affected.)

+1 for me on this idea. If you're calling an SRF, passing in a lateral
value, 'LATERAL' seems like it's just a noise word, and apparently the
SQL authors felt the same, as they don't require it for unnest().

> Since LATERAL is new in 9.3, I think the pros and cons of these choices
> should be considered now, rather than being allowed to slide by unexamined.

I agree that we should really hammer this down before 9.3 is out the
door.

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 18:33:12
Message-ID: 15925.1359138792@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * David Fetter (david(at)fetter(dot)org) wrote:
>> 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL.
>>
>> (As far as I can tell, those cases whose behaviour would be changed by
>> this actually produce errors in versions prior to 9.3, so no working
>> code should be affected.)

> +1 for me on this idea. If you're calling an SRF, passing in a lateral
> value, 'LATERAL' seems like it's just a noise word, and apparently the
> SQL authors felt the same, as they don't require it for unnest().

At first I didn't like this idea, but it's growing on me.

However ... David is wrong to claim that it's zero-risk. It's true that
an SRF can't contain any side-references today, but it can contain an
outer reference. Consider a case like

SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...)

In existing releases the "y" could be a valid outer reference to a.y.
If b also has a column y, David's proposal would cause us to prefer
that interpretation, since b.y would be more closely nested than a.y.
If you're lucky, you'd get a type-mismatch error, but if the two y's
are of similar datatypes the query would just silently do something
different than it used to.

This is a little bit far-fetched, but it could happen. As against that,
we make incompatible changes in every release, and it does seem like
assuming LATERAL for functions in FROM would be a usability gain most
of the time. And special-casing UNNEST to satisfy the standard seems
*really* ugly.

> I agree that we should really hammer this down before 9.3 is out the
> door.

Yeah, if we're going to do this it'd make the most sense to do it in the
same release that introduces LATERAL.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 18:54:33
Message-ID: 20130125185433.GV16126@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> However ... David is wrong to claim that it's zero-risk. It's true that
> an SRF can't contain any side-references today, but it can contain an
> outer reference. Consider a case like
>
> SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...)

I see what you mean, but on the other hand, that looks like something we
might actually want to complain about as 'y' is pretty clearly ambiguous
here. I'm a bit surprised that doesn't already throw an error.

> This is a little bit far-fetched, but it could happen. As against that,
> we make incompatible changes in every release, and it does seem like
> assuming LATERAL for functions in FROM would be a usability gain most
> of the time. And special-casing UNNEST to satisfy the standard seems
> *really* ugly.

It's definitely far-fetched, imv. If it's possible, within reason, to
explicitly throw a "please disambiguate 'y'" type of error in those
specific cases, that'd be nice, but I don't think it'd be required. A
mention in the release notes would be sufficient.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 19:46:46
Message-ID: 20130125194645.GW16126@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...)

Actually, this appears to fail already, at least in 9.2.2:

=> select * from (values (1)) v(a) where v.a in (select x from (values (2)) v2(a),
-> generate_series(1,a) x);
ERROR: function expression in FROM cannot refer to other relations of same query level
LINE 2: generate_series(1,a) x);
^

Unless it's something else that you were referring to...?

Thanks,

Stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-25 20:26:56
Message-ID: 5217.1359145616@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...)

> Actually, this appears to fail already, at least in 9.2.2:

> => select * from (values (1)) v(a) where v.a in (select x from (values (2)) v2(a),
> -> generate_series(1,a) x);
> ERROR: function expression in FROM cannot refer to other relations of same query level
> LINE 2: generate_series(1,a) x);
> ^

Huh ... you're right, I'd forgotten about that. That's an ancient bug
that got fixed in passing in the LATERAL work. So, as long as we're not
going to fix that bug in the back branches (which would be difficult
anyway IIRC), we don't have a compatibility problem ...

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LATERAL, UNNEST and spec compliance
Date: 2013-01-26 00:29:59
Message-ID: 29069.1359160199@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> Please find attached a patch which implements approach 3. The vast
> majority of it is changes to the regression tests. The removed
> regression tests in join.{sql,out} are no longer errors, although some
> of them are pretty standard DoS attacks, hence they're all removed.

Here's a less quick-hack-y approach to that.

regards, tom lane

Attachment Content-Type Size
implicit-lateral-1.patch text/x-patch 18.1 KB