Re: UNNEST with multiple args, and TABLE with multiple funcs

Lists: pgsql-hackers
From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-13 13:54:41
Message-ID: 48bb41eca62e428687cc9b8241661427@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Summary:

This patch implements a method for expanding multiple SRFs in parallel
that does not have the surprising LCM behaviour of SRFs-in-select-list.
(Functions returning fewer rows are padded with nulls instead.)

It then uses this method combined with a parse-time hack to implement
the (intended to be) spec-conforming behaviour of UNNEST with multiple
parameters, including flattening of composite results.

The upshot is that given a table like this:

postgres=# select * from t1;
a | b | c
---------------+-------------------+----------------------------------------------
{11,12,13} | {wombat} |
{5,10} | {foo,bar} | {"(123,xyzzy)","(456,plugh)","(789,plover)"}
{21,31,41,51} | {fred,jim,sheila} | {"(111,xyzzy)","(222,plugh)"}
(3 rows)

(where column "c" is an array of a composite type with 2 cols, "x" and "y")

You can do this:

postgres=# select u.* from t1, unnest(a,b,c) with ordinality as u;
?column? | ?column? | x | y | ordinality
----------+----------+-----+--------+------------
11 | wombat | | | 1
12 | | | | 2
13 | | | | 3
5 | foo | 123 | xyzzy | 1
10 | bar | 456 | plugh | 2
| | 789 | plover | 3
21 | fred | 111 | xyzzy | 1
31 | jim | 222 | plugh | 2
41 | sheila | | | 3
51 | | | | 4
(10 rows)

Or for an example of general combination of functions:

postgres=# select * from table(generate_series(10,20,5), unnest(array['fred','jim']));
?column? | ?column?
----------+----------
10 | fred
15 | jim
20 |
(3 rows)

Implementation Details:

The spec syntax for table function calls, <table function derived table>
in <table reference>, looks like TABLE(func(args...)) AS ...

This patch implements that, plus an extension: it allows multiple
functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
and defines this as meaning that the functions are to be evaluated in
parallel.

This is implemented by changing RangeFunction, function RTEs, and the
FunctionScan node to take lists of function calls rather than a single
function. The calling convention for SRFs is completely unchanged; each
function returns its own rows (or a tuplestore in materialize mode) just
as before, and FunctionScan combines the results into a single output
tuple (keeping track of which functions are exhausted in order to
correctly fill in nulls on a backwards scan).

Then, a hack in the parser converts unnest(...) appearing as a
func_table (and only there) into a list of unnest() calls, one for each
parameter. So

select ... from unnest(a,b,c)

is converted to

select ... from TABLE(unnest(a),unnest(b),unnest(c))

and if unnest appears as part of an existing list inside TABLE(), it's
expanded to multiple entries there too.

This parser hackery is of course somewhat ugly. But given the objective
of implementing the spec's unnest syntax, it seems to be the least ugly
of the possible approaches. (The hard part of doing it any other way
would be generating the description of the result type; composite array
parameters expand into multiple result columns.)

Overall, it's my intention here to remove as many as feasible of the old
reasons why one might use an SRF in the select list. This should also
address the points that Josh brought up in discussion of ORDINALITY
regarding use of SRF-in-select to unnest multiple arrays.

(As a side issue, this patch also sets up pathkeys for ordinality along
the lines of a patch I suggested to Greg a while back in response to
his.)

Current patch status:

This is a first working cut: no docs, no tests, not enough comments, the
deparse logic probably needs more work (it deparses correctly but the
formatting may be suboptimal). However all the functionality is believed
to be in place.

--
Andrew (irc:RhodiumToad)

Attachment Content-Type Size
table-functions.patch text/plain 85.4 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-14 00:22:06
Message-ID: 520ACDAE.6090404@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/13/2013 06:54 AM, Andrew Gierth wrote:
> Summary:
>
> This patch implements a method for expanding multiple SRFs in parallel
> that does not have the surprising LCM behaviour of SRFs-in-select-list.
> (Functions returning fewer rows are padded with nulls instead.)

BTW, if anyone is unsure of the use-case for this, I have some uses for it:

1. denormalized data stored in same-length arrays (usually for
compression reasons)

2. use with PL/Python-Numpy and PL/R functions which return multiple
arrays or 2D arrays.

In other words, I have *lots* of uses for this functionality, and I
think the analytics crowd will like it. Which means that I need to get
on testing it, of course ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-14 00:32:41
Message-ID: 520AD029.1060001@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/14/2013 08:22 AM, Josh Berkus wrote:
> On 08/13/2013 06:54 AM, Andrew Gierth wrote:
>> Summary:
>>
>> This patch implements a method for expanding multiple SRFs in parallel
>> that does not have the surprising LCM behaviour of SRFs-in-select-list.
>> (Functions returning fewer rows are padded with nulls instead.)
>
> BTW, if anyone is unsure of the use-case for this, I have some uses for it:
>
> 1. denormalized data stored in same-length arrays (usually for
> compression reasons)
>
> 2. use with PL/Python-Numpy and PL/R functions which return multiple
> arrays or 2D arrays.
>
> In other words, I have *lots* of uses for this functionality, and I
> think the analytics crowd will like it. Which means that I need to get
> on testing it, of course ...

Similarly, I see uses for this come up a lot, and usually have to work
around it with ugly invocations of multiple SRFs in the SELECT list in a
subquery.

I was thinking of implementing multi-argument unnest directly with `any`
parameters if I could get it to work, but hadn't started on it yet.

This looks like a really clever approach and it handles multiple
spec-compliance items. I'll grab the patch and try it out.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-19 16:23:44
Message-ID: 52124690.8050309@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

2013-08-13 15:54 keltezéssel, Andrew Gierth írta:
> Summary:
>
> This patch implements a method for expanding multiple SRFs in parallel
> that does not have the surprising LCM behaviour of SRFs-in-select-list.
> (Functions returning fewer rows are padded with nulls instead.)
>
> It then uses this method combined with a parse-time hack to implement
> the (intended to be) spec-conforming behaviour of UNNEST with multiple
> parameters, including flattening of composite results.
>
> The upshot is that given a table like this:
>
> postgres=# select * from t1;
> a | b | c
> ---------------+-------------------+----------------------------------------------
> {11,12,13} | {wombat} |
> {5,10} | {foo,bar} | {"(123,xyzzy)","(456,plugh)","(789,plover)"}
> {21,31,41,51} | {fred,jim,sheila} | {"(111,xyzzy)","(222,plugh)"}
> (3 rows)
>
> (where column "c" is an array of a composite type with 2 cols, "x" and "y")
>
> You can do this:
>
> postgres=# select u.* from t1, unnest(a,b,c) with ordinality as u;
> ?column? | ?column? | x | y | ordinality
> ----------+----------+-----+--------+------------
> 11 | wombat | | | 1
> 12 | | | | 2
> 13 | | | | 3
> 5 | foo | 123 | xyzzy | 1
> 10 | bar | 456 | plugh | 2
> | | 789 | plover | 3
> 21 | fred | 111 | xyzzy | 1
> 31 | jim | 222 | plugh | 2
> 41 | sheila | | | 3
> 51 | | | | 4
> (10 rows)
>
> Or for an example of general combination of functions:
>
> postgres=# select * from table(generate_series(10,20,5), unnest(array['fred','jim']));
> ?column? | ?column?
> ----------+----------
> 10 | fred
> 15 | jim
> 20 |
> (3 rows)
>
> Implementation Details:
>
> The spec syntax for table function calls, <table function derived table>
> in <table reference>, looks like TABLE(func(args...)) AS ...
>
> This patch implements that, plus an extension: it allows multiple
> functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
> and defines this as meaning that the functions are to be evaluated in
> parallel.
>
> This is implemented by changing RangeFunction, function RTEs, and the
> FunctionScan node to take lists of function calls rather than a single
> function. The calling convention for SRFs is completely unchanged; each
> function returns its own rows (or a tuplestore in materialize mode) just
> as before, and FunctionScan combines the results into a single output
> tuple (keeping track of which functions are exhausted in order to
> correctly fill in nulls on a backwards scan).
>
> Then, a hack in the parser converts unnest(...) appearing as a
> func_table (and only there) into a list of unnest() calls, one for each
> parameter. So
>
> select ... from unnest(a,b,c)
>
> is converted to
>
> select ... from TABLE(unnest(a),unnest(b),unnest(c))
>
> and if unnest appears as part of an existing list inside TABLE(), it's
> expanded to multiple entries there too.
>
> This parser hackery is of course somewhat ugly. But given the objective
> of implementing the spec's unnest syntax, it seems to be the least ugly
> of the possible approaches. (The hard part of doing it any other way
> would be generating the description of the result type; composite array
> parameters expand into multiple result columns.)

Harder maybe but it may still be cleaner in the long run.

> Overall, it's my intention here to remove as many as feasible of the old
> reasons why one might use an SRF in the select list.

Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
WITH ORDINALITY and this feature, I would vote for removing
SRF-in-targetlist and call the release PostgreSQL 10.0.

> This should also
> address the points that Josh brought up in discussion of ORDINALITY
> regarding use of SRF-in-select to unnest multiple arrays.
>
> (As a side issue, this patch also sets up pathkeys for ordinality along
> the lines of a patch I suggested to Greg a while back in response to
> his.)
>
> Current patch status:
>
> This is a first working cut: no docs, no tests, not enough comments, the
> deparse logic probably needs more work (it deparses correctly but the
> formatting may be suboptimal). However all the functionality is believed
> to be in place.

With this last paragraph in mind, I am trying a little review.

* Is the patch in a patch format which has context? (eg: context diff format)

Yes.

* Does it apply cleanly to the current git master?

Applies with some offsets on a few files but without fuzz.

* Does it include reasonable tests, necessary doc patches, etc?

No, as told by the patch author.

* Does the patch actually implement what it's supposed to do?

Yes.

* Do we want that?

Yes.

* Do we already have it?

No.

* Does it follow SQL spec, or the community-agreed behavior?

The SQL spec says these:

In 7.6 <table reference>

<table primary> ::=
...
| <table function derived table> [ AS ] <correlation name> [ <left paren> <derived column
list> <right paren> ]

also later in the same section:

<table function derived table> ::=
TABLE <left paren> <collection value expression> <right paren>

In 6.26 <value expression>

<collection value expression> ::=
<array value expression> | <multiset value expression>

In 6.36 <array value expression>

<array value expression> ::= <array concatenation> | <array primary>
<array concatenation> ::= <array value expression 1> <concatenation operator> <array primary>
<array value expression 1> ::= <array value expression>
<array primary> ::= <array value function> | <value expression primary>

6.3 <value expression primary>

<value expression primary> ::=
<parenthesized value expression>
| <nonparenthesized value expression primary>

<parenthesized value expression> ::= <left paren> <value expression> <right paren>

<nonparenthesized value expression primary> ::=
<unsigned value specification>
| <column reference>
| <set function specification>
| <window function>
| <nested window function>
| <scalar subquery>
| <case expression>
| <cast specification>
| <field reference>
| <subtype treatment>
| <method invocation>
| <static method invocation>
| <new specification>
| <attribute or method reference>
| <reference resolution>
| <collection value constructor>
| <array element reference>
| <multiset element reference>
| <next value expression>
| <routine invocation>

collection value constructor> ::=
| <array value constructor>
| <multiset value constructor>

So, the FROM TABLE(...) AS (...) syntax is a big can of worms and
I haven't even quoted <multiset value expression>.

As far as I can tell, these should also be allowed but isn't:

zozo=# select * from table('a'::text) as x;
ERROR: syntax error at or near "'a'"
LINE 1: select * from table('a'::text) as x;
^
zozo=# select x.* from t1, table(t1.a) as x;
ERROR: syntax error at or near ")"
LINE 1: select x.* from t1, table(t1.a) as x;
^
zozo=# select x.* from table((6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table((6)) as x(a int4);
^
zozo=# select x.* from table(values (6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table(values (6)) as x(a int4);
^
zozo=# select x.* from table(values(6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table(values(6)) as x(a int4);
^

What the patch implements is only the last choice for
<nonparenthesized value expression primary>: <routine invocation>

When you add documentation, it would be nice to mention it.

Also, the grammar extension is a start for adding all the other
standard choices for TABLE().

* Does it include pg_dump support (if applicable)?

n/a

* Are there dangers?

I can't see any. 8-)

* Have all the bases been covered?

My previous comments about the TABLE() syntax says it all.
You can interpret it either way. :-)

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

I don't know.

* Are there any assertion failures or crashes?

No.

* Does the patch slow down simple tests?

No.

* If it claims to improve performance, does it?

It certainly improves writing queries, as functions inside
unnest() get processed in one scan.

* Does it slow down other things?

I don't think so.

* Does it follow the project coding guidelines?

Yes.

* Are there portability issues?

No.

* Will it work on Windows/BSD etc?

It should, the code uses standard internal PostgreSQL APIs
and extends them. No new system call.

* Are the comments sufficient and accurate?

According to the author, no.

* Does it do what it says, correctly?

Yes.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

* Is everything done in a way that fits together coherently with other features/modules?

I think so

* Are there interdependencies that can cause problems?

I don't know.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-19 17:45:23
Message-ID: CAFj8pRCFDW0QKQS4coTgpWXX2KLX82dLGpGJ9LD53hP2nhEbbQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

Harder maybe but it may still be cleaner in the long run.
>
> Overall, it's my intention here to remove as many as feasible of the old
>> reasons why one might use an SRF in the select list.
>>
>
> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> WITH ORDINALITY and this feature, I would vote for removing
> SRF-in-targetlist and call the release PostgreSQL 10.0.
>

Although I would to remove SRF from targetlist, I don't think so this hurry
strategy is good idea. We should to provide new functionality and old
functionality one year as minimum, and we should to announce so this
feature is deprecated - and maybe use a GUC for disabling, warning and
deprecating. More, I would to see 9.4 release:). x.4 are happy PostgreSQL
releases :)

Regards

Pavel


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-19 18:03:52
Message-ID: 52125E08.4020107@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/19/2013 09:23 AM, Boszormenyi Zoltan wrote:
>
> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> WITH ORDINALITY and this feature, I would vote for removing
> SRF-in-targetlist and call the release PostgreSQL 10.0.

That's not realistic. We'd have to deprecate that syntax and
repeatedly and loudly warn people about it for at least 3 years after
the release of 9.3. You're talking about asking people to refactor
hundreds or thousands of lines of code which makes current use of things
like regex_match() in the target list.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-19 18:19:14
Message-ID: 521261A2.6000403@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-08-19 20:03 keltezéssel, Josh Berkus írta:
> On 08/19/2013 09:23 AM, Boszormenyi Zoltan wrote:
>> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
>> WITH ORDINALITY and this feature, I would vote for removing
>> SRF-in-targetlist and call the release PostgreSQL 10.0.
> That's not realistic.

I know. I am such an agent provocateur. :-)

> We'd have to deprecate that syntax and
> repeatedly and loudly warn people about it for at least 3 years after
> the release of 9.3. You're talking about asking people to refactor
> hundreds or thousands of lines of code which makes current use of things
> like regex_match() in the target list.
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-19 20:04:27
Message-ID: c4ea50ccd8b358c03bae2d10f84b05b5@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan wrote:
>> This parser hackery is of course somewhat ugly. But given the objective
>> of implementing the spec's unnest syntax, it seems to be the least ugly
>> of the possible approaches. (The hard part of doing it any other way
>> would be generating the description of the result type; composite array
>> parameters expand into multiple result columns.)
>
> Harder maybe but it may still be cleaner in the long run.

I'm not so sure.

As far as I'm concerned, though, the situation is fairly simple: there
are no proposals on the table for any mechanism that would allow the
deduction of a return type structure for multi-arg unnest, I have
tried and failed to come up with a usable alternative proposal, and
there is no prospect of one resulting from any other work that I know
about. So the parser hack is the way it goes, and anyone who doesn't
like it is welcome to suggest a better idea.

>> Overall, it's my intention here to remove as many as feasible of the old
>> reasons why one might use an SRF in the select list.
>
> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> WITH ORDINALITY and this feature, I would vote for removing
> SRF-in-targetlist and call the release PostgreSQL 10.0.

I want to make this ABSOLUTELY clear: I am not advocating removing
SRF-in-targetlist in the near future and I will not support anyone who
does. Please do not use this code as an argument for that (at least
until a few releases have elapsed). All I'm interested in at this
point is providing an alternative with better semantics.

> The SQL spec says these:
>
> In 7.6 <table reference>
[mega-snip]
>
> As far as I can tell, these should also be allowed but isn't:

No, they're not allowed. You missed this rule (in 7.6 Syntax Rules):

2) If a <table primary> TP simply contains a <table function derived
table> TFDT, then:

a) The <collection value expression> immediately contained in TFDT
shall be a <routine invocation>.

In other words, func(...) is the only allowed form for the collection
value expression inside TABLE( ). (Same rule exists in 201x, but
numbered 6 rather than 2.)

Largely as a matter of consistency, the patch does presently allow
expressions that are not <routine invocation>s but which are part of
the func_expr_windowless production, so things like TABLE(USER) work.
(This is because historically these are allowed in the FROM clause as
tables.) I'm not sure this is a good idea in general; should it be
tightened up to only allow func_application?

> * If it claims to improve performance, does it?
>
> It certainly improves writing queries, as functions inside
> unnest() get processed in one scan.

I'm not making any specific performance claims, but I have tested it
against the idea of doing separate function scans with a full join on
ordinality columns, and my approach is faster (1.5x to 2x in my tests)
even with pathkey support and with elimination of extra materialize
nodes (by allowing mark/restore in FunctionScan).

------

Since the original patch was posted I have done further work on it,
including some tests. I have also come up with an additional
possibility: that of allowing multiple SRFs that return RECORD with
column definition lists, and SRFs-returning-RECORD combined with
ORDINALITY, by extending the syntax further:

select * from TABLE(func1() AS (a text, b integer),
func2() AS (c integer, d text));

select * from TABLE(func1() AS (a text, b integer))
WITH ORDINALITY AS f(a,b,o);

-- shame to have to duplicate the column names, but avoiding that would
-- not have been easy

This removes the restriction of the previous ORDINALITY patch that
prevented its use with SRFs that needed coldef lists.

I'm open to other suggestions on the syntax for this.

(My implementation of this works by making the column definition list
a property of the function call, rather than of the RTE or the
FunctionScan node. This eliminates a few places where TYPEFUNC_RECORD
had to be handled as a special case.)

--
Andrew. (irc:RhodiumToad)


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-20 05:18:45
Message-ID: 5212FC35.20202@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-08-19 22:04 keltezéssel, Andrew Gierth írta:
> Boszormenyi Zoltan wrote:
>>> This parser hackery is of course somewhat ugly. But given the objective
>>> of implementing the spec's unnest syntax, it seems to be the least ugly
>>> of the possible approaches. (The hard part of doing it any other way
>>> would be generating the description of the result type; composite array
>>> parameters expand into multiple result columns.)
>> Harder maybe but it may still be cleaner in the long run.
> I'm not so sure.
>
> As far as I'm concerned, though, the situation is fairly simple: there
> are no proposals on the table for any mechanism that would allow the
> deduction of a return type structure for multi-arg unnest, I have
> tried and failed to come up with a usable alternative proposal, and
> there is no prospect of one resulting from any other work that I know
> about. So the parser hack is the way it goes, and anyone who doesn't
> like it is welcome to suggest a better idea.
>
>>> Overall, it's my intention here to remove as many as feasible of the old
>>> reasons why one might use an SRF in the select list.
>> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
>> WITH ORDINALITY and this feature, I would vote for removing
>> SRF-in-targetlist and call the release PostgreSQL 10.0.
> I want to make this ABSOLUTELY clear: I am not advocating removing
> SRF-in-targetlist in the near future and I will not support anyone who
> does. Please do not use this code as an argument for that (at least
> until a few releases have elapsed). All I'm interested in at this
> point is providing an alternative with better semantics.
>
>> The SQL spec says these:
>>
>> In 7.6 <table reference>
> [mega-snip]
>> As far as I can tell, these should also be allowed but isn't:
> No, they're not allowed. You missed this rule (in 7.6 Syntax Rules):
>
> 2) If a <table primary> TP simply contains a <table function derived
> table> TFDT, then:
>
> a) The <collection value expression> immediately contained in TFDT
> shall be a <routine invocation>.
>
> In other words, func(...) is the only allowed form for the collection
> value expression inside TABLE( ). (Same rule exists in 201x, but
> numbered 6 rather than 2.)

You are right, I missed it in the standard. Sorry.

> Largely as a matter of consistency, the patch does presently allow
> expressions that are not <routine invocation>s but which are part of
> the func_expr_windowless production, so things like TABLE(USER) work.
> (This is because historically these are allowed in the FROM clause as
> tables.) I'm not sure this is a good idea in general; should it be
> tightened up to only allow func_application?
>
>> * If it claims to improve performance, does it?
>>
>> It certainly improves writing queries, as functions inside
>> unnest() get processed in one scan.
> I'm not making any specific performance claims, but I have tested it
> against the idea of doing separate function scans with a full join on
> ordinality columns, and my approach is faster (1.5x to 2x in my tests)
> even with pathkey support and with elimination of extra materialize
> nodes (by allowing mark/restore in FunctionScan).
>
> ------
>
> Since the original patch was posted I have done further work on it,
> including some tests. I have also come up with an additional
> possibility: that of allowing multiple SRFs that return RECORD with
> column definition lists, and SRFs-returning-RECORD combined with
> ORDINALITY, by extending the syntax further:
>
> select * from TABLE(func1() AS (a text, b integer),
> func2() AS (c integer, d text));
>
> select * from TABLE(func1() AS (a text, b integer))
> WITH ORDINALITY AS f(a,b,o);
>
> -- shame to have to duplicate the column names, but avoiding that would
> -- not have been easy
>
> This removes the restriction of the previous ORDINALITY patch that
> prevented its use with SRFs that needed coldef lists.

Very nice.

> I'm open to other suggestions on the syntax for this.

It's consistent with straight SRFs in FROM, the function
parameters are attached to the functions themselves.
I think it's good as is.

> (My implementation of this works by making the column definition list
> a property of the function call,

Which also makes it easier on the eyes and brain when reading
someone else's SQL.

> rather than of the RTE or the
> FunctionScan node. This eliminates a few places where TYPEFUNC_RECORD
> had to be handled as a special case.)

This is the other plus. No special casing is good.

The only minus is that you haven't attached the new patch. :-)

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-20 05:58:19
Message-ID: 5213057B.7080900@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/20/2013 02:03 AM, Josh Berkus wrote:
> On 08/19/2013 09:23 AM, Boszormenyi Zoltan wrote:
>>
>> Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
>> WITH ORDINALITY and this feature, I would vote for removing
>> SRF-in-targetlist and call the release PostgreSQL 10.0.
>
> That's not realistic. We'd have to deprecate that syntax and
> repeatedly and loudly warn people about it for at least 3 years after
> the release of 9.3. You're talking about asking people to refactor
> hundreds or thousands of lines of code which makes current use of things
> like regex_match() in the target list.

Agreed. Even three years is optimistic; after that long it could
probably be made into an ERROR by default with a backward-compat GUC,
but certainly not removed.

I'm still running into people running 8.2 and having issues upgrading
due to the 8.3 removal of implicit casts from text, and even the removal
of add_missing_from .

If we want people to upgrade this century it's worth minimising the
amount of unnecessary breakage. SRF-in-SELECT might be ugly, but simply
ripping it out certainly counts as unnecessary breakage.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-20 06:07:45
Message-ID: 20130820060745.GB2099@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 19, 2013 at 07:45:23PM +0200, Pavel Stehule wrote:
> Hello
>
> Harder maybe but it may still be cleaner in the long run.
> >
> > Overall, it's my intention here to remove as many as feasible of the old
> >> reasons why one might use an SRF in the select list.
> >>
> >
> > Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> > WITH ORDINALITY and this feature, I would vote for removing
> > SRF-in-targetlist and call the release PostgreSQL 10.0.
> >
>
> Although I would to remove SRF from targetlist, I don't think so this hurry
> strategy is good idea. We should to provide new functionality and old
> functionality one year as minimum, and we should to announce so this
> feature is deprecated

We could do this in 9.3, but all it would be is an announcement, i.e.
no code change of any nature.

> - and maybe use a GUC for disabling, warning and deprecating.

With utmost respect, I think the general idea of setting SQL grammar
via GUC is a really bad one. When we've done so in the past, it's
done more harm than good, and we should not repeat it.

> More, I would to see 9.4 release:).

Same here! :)

> x.4 are happy PostgreSQL releases :)

Each one has been at least baseline happy for me since 7.1. Some have
made me overjoyed, though.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-20 06:13:04
Message-ID: CAFj8pRB5TJNDkWa4o_o20T1UuQVJGgXQw55RuzPFecJzTHyxmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/8/20 David Fetter <david(at)fetter(dot)org>

> On Mon, Aug 19, 2013 at 07:45:23PM +0200, Pavel Stehule wrote:
> > Hello
> >
> > Harder maybe but it may still be cleaner in the long run.
> > >
> > > Overall, it's my intention here to remove as many as feasible of the
> old
> > >> reasons why one might use an SRF in the select list.
> > >>
> > >
> > > Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> > > WITH ORDINALITY and this feature, I would vote for removing
> > > SRF-in-targetlist and call the release PostgreSQL 10.0.
> > >
> >
> > Although I would to remove SRF from targetlist, I don't think so this
> hurry
> > strategy is good idea. We should to provide new functionality and old
> > functionality one year as minimum, and we should to announce so this
> > feature is deprecated
>
> We could do this in 9.3, but all it would be is an announcement, i.e.
> no code change of any nature.
>
> > - and maybe use a GUC for disabling, warning and deprecating.
>
> With utmost respect, I think the general idea of setting SQL grammar
> via GUC is a really bad one. When we've done so in the past, it's
> done more harm than good, and we should not repeat it.
>

so as minumum is controlling warning via GUC, we should to help with
identification of problematic queries.

Regards

Pavel

>
> > More, I would to see 9.4 release:).
>
> Same here! :)
>
> > x.4 are happy PostgreSQL releases :)
>
> Each one has been at least baseline happy for me since 7.1. Some have
> made me overjoyed, though.
>
> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-20 06:22:57
Message-ID: 52130B41.7050008@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-08-20 08:13 keltezéssel, Pavel Stehule írta:
>
>
>
> 2013/8/20 David Fetter <david(at)fetter(dot)org <mailto:david(at)fetter(dot)org>>
>
> On Mon, Aug 19, 2013 at 07:45:23PM +0200, Pavel Stehule wrote:
> > Hello
> >
> > Harder maybe but it may still be cleaner in the long run.
> > >
> > > Overall, it's my intention here to remove as many as feasible of the old
> > >> reasons why one might use an SRF in the select list.
> > >>
> > >
> > > Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
> > > WITH ORDINALITY and this feature, I would vote for removing
> > > SRF-in-targetlist and call the release PostgreSQL 10.0.
> > >
> >
> > Although I would to remove SRF from targetlist, I don't think so this hurry
> > strategy is good idea. We should to provide new functionality and old
> > functionality one year as minimum, and we should to announce so this
> > feature is deprecated
>
> We could do this in 9.3, but all it would be is an announcement, i.e.
> no code change of any nature.
>
> > - and maybe use a GUC for disabling, warning and deprecating.
>

To really ensure backward compatibility, this sentence should read as
"add a GUC for disabling *the* warning and deprecating." :->

As I said, I am such an agent provocateur.
Let this side track die and concentrate on the merits of the patch itself. :-)

>
> With utmost respect, I think the general idea of setting SQL grammar
> via GUC is a really bad one. When we've done so in the past, it's
> done more harm than good, and we should not repeat it.
>
>
> so as minumum is controlling warning via GUC, we should to help with identification of
> problematic queries.
>
> Regards
>
> Pavel
>
>
> > More, I would to see 9.4 release:).
>
> Same here! :)
>
> > x.4 are happy PostgreSQL releases :)
>
> Each one has been at least baseline happy for me since 7.1. Some have
> made me overjoyed, though.
>
> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org <mailto:david(at)fetter(dot)org>> http://fetter.org/
> Phone: +1 415 235 3778 <tel:%2B1%20415%20235%203778> AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com <mailto:david(dot)fetter(at)gmail(dot)com>
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
> <http://www.tripit.com/feed/ical/people/david74/tripit.ics>
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-24 14:56:41
Message-ID: 1377356201.8206.7.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2013-08-13 at 13:54 +0000, Andrew Gierth wrote:
> Summary:
>
> This patch implements a method for expanding multiple SRFs in parallel
> that does not have the surprising LCM behaviour of SRFs-in-select-list.
> (Functions returning fewer rows are padded with nulls instead.)

Fails to build in contrib:

pg_stat_statements.c -MMD -MP -MF .deps/pg_stat_statements.Po
pg_stat_statements.c: In function ‘JumbleRangeTable’:
pg_stat_statements.c:1459:27: error: ‘RangeTblEntry’ has no member named ‘funcexpr’


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-26 23:24:58
Message-ID: 57fdbff8ae01c9784d9c095d10008f7d@news-out.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Latest version of patch. This should be it as far as code goes; there
may be some more regression test work, and a doc patch will be
forthcoming.

This version supports, in addition to the previous stuff:

SELECT ... FROM TABLE(func() AS (colname coltype, ...));

i.e. the column definition list required for functions that return
arbitrary RECORD results can go inside the TABLE() construct. This
allows more than one such function in a call:

SELECT ... FROM TABLE(func1() AS (a integer), func2() AS (b text));

or mixing RECORD functions with ORDINALITY:

SELECT ... FROM TABLE(func1() AS (c text)) WITH ORDINALITY;

The existing FROM func() AS f(c text) is still supported of course,
and the variation FROM TABLE(func()) AS f(c text) is also supported
but only when there's exactly one function and no ORDINALITY.

Other changes:

- function dependence on executor parameters is now tracked
per-function, so that on rescan, only affected functions are
re-executed, and others are simply rescanned from the existing
tuplestore

- some cases where deparse or other code broke because an element
of funcexprs was not actually a FuncExpr have been fixed

- fixed the pg_stat_statements issue

A change I _didn't_ include, but did test, was adding mark/restore to
FunctionScan to allow mergejoins on ordinality columns to work without
needing extra nodes (which I did to do some performance tests referred
to in a previous message). I took this code back out because it didn't
seem to make much difference: the planner often (not always) adds the
Materialize node even when it's not needed, in the belief that it is
faster; the overhead of the extra node doesn't seem serious; and the
case is of limited applicability (only useful when joining against
something other than a function using the ordinal column alone).

--
Andrew (irc:RhodiumToad)

Attachment Content-Type Size
table-functions-2.patch text/plain 164.9 KB

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-27 06:04:44
Message-ID: 521C417C.2030203@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-08-27 01:24 keltezéssel, Andrew Gierth írta:
> Latest version of patch. This should be it as far as code goes; there
> may be some more regression test work, and a doc patch will be
> forthcoming.
>
> This version supports, in addition to the previous stuff:
>
> [snip]

In my limited testing, it works well, and the patch looks good.

When adding regression tests, can you please add intentional
syntax error cases to exercise all the new ereport()s?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-27 13:44:18
Message-ID: 10629.1377611058@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> When adding regression tests, can you please add intentional
> syntax error cases to exercise all the new ereport()s?

Please do not add test cases merely to prove that. Yeah, you should
probably have exercised each error case in devel testing, but that does
not mean that every future run of the regression tests needs to do it too.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-28 04:09:05
Message-ID: 1377662945.14126.1.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2013-08-27 at 09:44 -0400, Tom Lane wrote:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> > When adding regression tests, can you please add intentional
> > syntax error cases to exercise all the new ereport()s?
>
> Please do not add test cases merely to prove that. Yeah, you should
> probably have exercised each error case in devel testing, but that does
> not mean that every future run of the regression tests needs to do it too.

I disagree. The next person who wants to hack on this feature should be
given the confidence that he's not breaking behavior that the last guy
put in.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-08-28 04:11:07
Message-ID: 1377663067.14126.3.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2013-08-26 at 23:24 +0000, Andrew Gierth wrote:
> Latest version of patch. This should be it as far as code goes; there
> may be some more regression test work, and a doc patch will be
> forthcoming.

In src/include/optimizer/paths.h, you are using "operator" as a function
argument name, which breaks cpluspluscheck. Try using opid or
operatorid.


From: Noah Misch <noah(at)leadboat(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-09-05 13:35:52
Message-ID: 20130905133552.GA155910@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 28, 2013 at 12:09:05AM -0400, Peter Eisentraut wrote:
> On Tue, 2013-08-27 at 09:44 -0400, Tom Lane wrote:
> > Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
> > > When adding regression tests, can you please add intentional
> > > syntax error cases to exercise all the new ereport()s?
> >
> > Please do not add test cases merely to prove that. Yeah, you should
> > probably have exercised each error case in devel testing, but that does
> > not mean that every future run of the regression tests needs to do it too.
>
> I disagree. The next person who wants to hack on this feature should be
> given the confidence that he's not breaking behavior that the last guy
> put in.

+1. I wouldn't make full error-outcome test coverage a condition of patch
acceptance. However, when an author chooses to submit high-quality tests with
that level of detail, our source tree is the place to archive them. I share
Tom's desire for a Makefile target that completes quickly and checks only
those behaviors most likely to break, but not at the cost of letting deep test
coverage dissipate in a mailing list attachment or in the feature author's
home directory.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-09-13 19:03:01
Message-ID: 87zjrgimkm.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Latest version of patch, incorporating regression tests and docs, and
fixing the "operator" issue previously raised.

--
Andrew (irc:RhodiumToad)

Attachment Content-Type Size
tablefunc-20130913.patch text/x-patch 204.9 KB

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-09-16 08:39:52
Message-ID: 5236C3D8.4020601@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013-09-13 21:03 keltezéssel, Andrew Gierth írta:
> Latest version of patch, incorporating regression tests and docs, and
> fixing the "operator" issue previously raised.

It looks good. I think it's ready for a committer.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-10-01 18:38:54
Message-ID: 524B16BE.90301@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've spent some time reviewing this patch - looks pretty good! I'm not
through yet, but I wanted to post an update. Attached is a new version,
with some modifications I made. Notably:

I added a new struct to hold the per-function executor state - tupdesc,
tuplestore, rowcount and slot - instead of the lists that need to be
kept in sync. This is more readable.

I replaced the CreateTupleDescCopyMany() function with a function called
TupleDescCopyEntry(), which initializes a single attribute like
TupleDescInitEntry(), but copies the information from another tupledesc.
It is used in a loop to construct the composite tupledec in multiple
function or ordinality case. This is more flexible, no need to create
the dummy single-attribute TupleDesc for ordinality anymore, for example.

I refactored the grammar a bit; the way func_table rule returned a one-
or two element list to essentially pass up a flag was an ugly hack.

Below are a couple of more comments:

On 13.09.2013 22:03, Andrew Gierth wrote:
> ***************
> *** 3529,3534 **** static Expr *
> --- 3539,3545 ----
> simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
> Oid result_collid, Oid input_collid, List **args_p,
> bool funcvariadic, bool process_args, bool allow_non_const,
> + FuncExpr *orig_funcexpr,
> eval_const_expressions_context *context)
> {
> List *args = *args_p;

The new argument needs to be explained in the comment above.

> ! <para>
> ! The special table function <literal>UNNEST</literal> may be called with
> ! any number of array parameters, and returns a corresponding number of
> ! columns, as if <literal>UNNEST</literal>
> ! (see <xref linkend="functions-array">) had been called on each parameter
> ! separately and combined using the <literal>TABLE</literal> construct. The
> ! number of result columns is determined by the sum of the arities of the
> ! array element types; that is to say, any array of composite type is
> ! expanded into separate result columns for each field of the type.
> ! Likewise, the number of rows returned is determined by the largest array
> ! parameter, with smaller values padded with NULLs.
> ! </para>

"Arities", really? :-). Please reword that into more plain English.

I'm going to put this aside for a day or two now, but will get back to
it later to finish the review and commit (unless someone beats me to
it). Meanwhile, if you could do something about that comment and manual
paragraph above, and re-review the changes I made, that would be great.

- Heikki

Attachment Content-Type Size
tablefunc-20131001-heikki-1.patch.gz application/x-gzip 36.9 KB

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-10-01 19:59:50
Message-ID: 87bo38vkvp.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Heikki" == Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:

Heikki> I've spent some time reviewing this patch - looks pretty
Heikki> good! I'm not through yet, but I wanted to post an
Heikki> update. Attached is a new version, with some modifications I
Heikki> made. Notably:

Heikki> I refactored the grammar a bit

And broke the ability to do TABLE(unnest(a,b,c), otherfunc(d)).

Yes, you can still do TABLE(unnest(a), unnest(b), unnest(c), otherfunc(d))
but I view this as a significant loss of functionality; I will see about
fixing that.

I am still looking at the other changes.

--
Andrew (irc:RhodiumToad)


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-10-08 01:01:39
Message-ID: 87fvscfuvy.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here is a new patch with the following changes on top of Heikki's
version (all the changes in which I've otherwise kept):

1. Changed TupleDescCopyEntry parameter order to be (dest,src) for
better consistency with TupleDescInitEntry and general C style.

2. Removed CreateTupleDescCopyExtend which is now dead code

3. Some small cleanups in the building of tupdescs in executor init

4. Refactored the grammar further to reinstant multi-arg unnest inside
table(), and added regression tests for that

5. comment and doc changes requested in Heikki's message

--
Andrew (irc:RhodiumToad)

Attachment Content-Type Size
tablefunc-20131008.patch text/x-patch 181.8 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-18 03:36:42
Message-ID: 13944.1384745802@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> Here is a new patch with the following changes on top of Heikki's
> version (all the changes in which I've otherwise kept):

Here is an updated version:

1. Rebased against HEAD.

2. New code is pgindent'ed (mainly because most of the rebasing pain was
because the previous ORDINALITY patch hadn't been pgindented before
committing) and is now git diff --check clean.

3. Fixed build failure due to recent change of
make_pathkey_from_sortinfo's API. I don't think that what I did here is
actually right --- if we're keeping build_expression_pathkey, then
probably it needs to expose nullable_relids as a parameter. But this
will at least make the thing compilable pending review of that point.

This compiles clean and passes regression tests, but I've not done
any actual reviewing yet.

regards, tom lane

Attachment Content-Type Size
table-functions-20131117.patch text/x-diff 220.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 16:13:26
Message-ID: 10034.1384877606@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Here is a new patch with the following changes on top of Heikki's
>> version (all the changes in which I've otherwise kept):

> Here is an updated version:

I've been hacking on this patch all day yesterday. What I'm on about at
the moment is reversing the decision to move range functions' funccoltypes
etc into FuncExpr. That's a bad idea on the grounds of bloating FuncExpr,
but the real problem with it is this: what happens if the planner decides
to inline or const-simplify the function expression? You just lost a
critical part of the RTE's infrastructure, that's what. So that's got to
go, and I've been fooling with different ways to represent the info for
multiple functions within RangeTblEntry. What I have at the moment is

/*
* Fields valid for a function RTE (else NIL/zero):
*
* There can be multiple function expressions in a function RTE.
* funccoldeflist is an integer list (of the same length as funcexprs)
* containing true if function had a column definition list, else false.
* funccolcounts is an integer list (of the same length as funcexprs)
* showing the number of RTE output columns produced by each function.
* The length of eref->colnames must be equal to either the sum of the
* funccolcounts entries, or one more than the sum if funcordinality is
* true. funccoltypes, funccoltypmods, and funccolcollations give type
* information about each output column (these lists must have the same
* length as eref->colnames). Remember that when a function returns a
* named composite type, any dropped columns in that type will have dummy
* corresponding entries in these lists.
*
* Note: funccoltypes etc are derived from either the functions' declared
* result types, or their column definition lists in case of functions
* returning RECORD. Storing this data in the RTE is redundant in the
* former case, but for simplicity we store it always.
*/
List *funcexprs; /* expression trees for func calls */
List *funccoldeflist; /* integer list of has-coldeflist booleans */
List *funccolcounts; /* number of output columns from each func */
List *funccoltypes; /* OID list of column type OIDs */
List *funccoltypmods; /* integer list of column typmods */
List *funccolcollations; /* OID list of column collation OIDs */
bool funcordinality; /* is this called WITH ORDINALITY? */

which has the advantage that the ordinality column is no longer such a
special case, it's right there in the lists. However, it turns out that
in most places where I thought we could just consult the entry-per-column
lists, we can't. We still have to do the get_expr_result_type() dance,
because we need up-to-date information about which columns of a
composite-returning function's output have been dropped since the RTE was
made. That means we'd have to chase the entry-per-function lists in
parallel with the entry-per-column lists, which is a PITA.

I'm thinking possibly it's worth inventing a new Node type that would just
be infrastructure for RTE_FUNCTION RTEs, so that we'd have something like
this in RangeTblEntry:

List *functions; /* List of RangeTblFunction nodes */
bool funcordinality; /* is this called WITH ORDINALITY? */

and a node type RangeTblFunction with fields

Node *funcexpr; /* executable expression for the function */
int funccolcount; /* number of columns emitted by function */
/* These lists are NIL unless function had a column definition list: */
List *funccoltypes; /* OID list of column type OIDs */
List *funccoltypmods; /* integer list of column typmods */
List *funccolcollations; /* OID list of column collation OIDs */

BTW, the reason we need to store the column count explicitly is that we
have to ignore the added columns if a composite type has had an ADD COLUMN
done to it since the RTE was made. The submitted patch fails rather
nastily in such cases, if the composite type isn't last in the function
list.

Thoughts, better ideas?

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 18:58:06
Message-ID: 87li0kz0t9.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> I've been hacking on this patch all day yesterday. What I'm on
Tom> about at the moment is reversing the decision to move range
Tom> functions' funccoltypes etc into FuncExpr. That's a bad idea on
Tom> the grounds of bloating FuncExpr, but the real problem with it
Tom> is this: what happens if the planner decides to inline or
Tom> const-simplify the function expression? You just lost a
Tom> critical part of the RTE's infrastructure, that's what.

Inlining should already check that the type doesn't change as a
result; where exactly is the issue here?

What matters is whether get_expr_result_type still works; the only
place (other than ruleutils) now that looks at funccoltypes etc. is
the guts of that. Is it incorrect to assume that if a FuncExpr is
transformed in any way, the result should give the same return from
get_expr_result_type?

--
Andrew (irc:RhodiumToad)


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 19:13:21
Message-ID: 87hab8yzq9.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> BTW, the reason we need to store the column count explicitly is
Tom> that we have to ignore the added columns if a composite type has
Tom> had an ADD COLUMN done to it since the RTE was made. The
Tom> submitted patch fails rather nastily in such cases, if the
Tom> composite type isn't last in the function list.

Am I understanding correctly that the only reason this didn't fail
before we added ORDINALITY is that the executor in general does not
care if there are more columns in a tuple than it expects? And that
adding ORDINALITY broke this already?

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 19:50:28
Message-ID: 17284.1384890628@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> BTW, the reason we need to store the column count explicitly is
> Tom> that we have to ignore the added columns if a composite type has
> Tom> had an ADD COLUMN done to it since the RTE was made. The
> Tom> submitted patch fails rather nastily in such cases, if the
> Tom> composite type isn't last in the function list.

> Am I understanding correctly that the only reason this didn't fail
> before we added ORDINALITY is that the executor in general does not
> care if there are more columns in a tuple than it expects? And that
> adding ORDINALITY broke this already?

Probably it's already broken with ORDINALITY, but I've not checked.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 20:01:07
Message-ID: 17493.1384891267@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> I've been hacking on this patch all day yesterday. What I'm on
> Tom> about at the moment is reversing the decision to move range
> Tom> functions' funccoltypes etc into FuncExpr. That's a bad idea on
> Tom> the grounds of bloating FuncExpr, but the real problem with it
> Tom> is this: what happens if the planner decides to inline or
> Tom> const-simplify the function expression? You just lost a
> Tom> critical part of the RTE's infrastructure, that's what.

> Inlining should already check that the type doesn't change as a
> result; where exactly is the issue here?

The issue is that if you want to dig column type information out of
a function RTE, that won't necessarily work after preprocess_expression
has had its way with the contained expressions. That's needed at
the very least in create_functionscan_plan.

You might try to argue that flattening of an expression-returning-RECORD
is guaranteed to preserve whatever we know about the result type, but
that argument sounds mighty flimsy to me. There's nothing much
guaranteeing that the expression couldn't be folded to a Const, or
at least something that didn't have a FuncExpr at the top.

In any case, there is absolutely nothing that is desirable enough
about this representation that we should take any risks for it.
The historical approach is that the coldeflist data is securely attached
to the RangeTblEntry itself, and I think we should stick with that.

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-19 21:14:39
Message-ID: 87bo1gyw1b.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

>> Inlining should already check that the type doesn't change as a
>> result; where exactly is the issue here?

Tom> The issue is that if you want to dig column type information out
Tom> of a function RTE, that won't necessarily work after
Tom> preprocess_expression has had its way with the contained
Tom> expressions. That's needed at the very least in
Tom> create_functionscan_plan.

My intention was that whatever was in the funcexprs list should be
self-describing as far as result type information goes - whether or
not it was a FuncExpr node. create_functionscan_plan used to copy the
funccoltypes etc. to the FunctionScan node, but I removed that in
favour of having get_expr_result_type do the work.

Tom> You might try to argue that flattening of an
Tom> expression-returning-RECORD is guaranteed to preserve whatever
Tom> we know about the result type, but that argument sounds mighty
Tom> flimsy to me. There's nothing much guaranteeing that the
Tom> expression couldn't be folded to a Const, or at least something
Tom> that didn't have a FuncExpr at the top.

So, at the moment, get_expr_result_type can't return a tupdesc for an
expression tree that doesn't have FuncExpr or OpExpr at the top and
which doesn't return a named composite type.

If there's an issue here, then it goes beyond functions-returning-RECORD
and affects flattening of functions with OUT parameters too; if there
were some way for those to get replaced by a Const node (currently
there is not: see comment in evaluate_function) then that would break,
and that clearly has nothing to do with coldef lists.

I can see that it would be nice to allow folding and so on in these
cases, but it seems to me that having some infrastructure that would
allow get_expr_result_type to return the same result for the
transformed call as the original call is a prerequisite for any such
change.

Tom> In any case, there is absolutely nothing that is desirable
Tom> enough about this representation that we should take any risks
Tom> for it. The historical approach is that the coldeflist data is
Tom> securely attached to the RangeTblEntry itself, and I think we
Tom> should stick with that.

What I was aiming for was to _remove_ any special-case handling of
coldef lists (post-parser) and use only get_expr_result_type.

--
Andrew. (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-20 04:22:26
Message-ID: 31727.1384921346@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> The issue is that if you want to dig column type information out
> Tom> of a function RTE, that won't necessarily work after
> Tom> preprocess_expression has had its way with the contained
> Tom> expressions. That's needed at the very least in
> Tom> create_functionscan_plan.

> What I was aiming for was to _remove_ any special-case handling of
> coldef lists (post-parser) and use only get_expr_result_type.

[ thinks for awhile... ] I can see that that would have some value
if we were looking to expand the usage of coldeflists to allow
"record_returning_function(...) AS (coldeflist)" to appear in any
expression context not just function RTEs. However, I can't get
excited about that as a future feature, for two reasons:

1. IME, functions returning unconstrained RECORD tend to return sets
as well; if you don't know what columns you return, it's unlikely you
know how many rows you return. So this would only be a sensible
feature addition if we were looking to encourage the use of SRFs outside
the FROM clause. I'm not sure whether we are going to deprecate that,
but I'm pretty sure we don't want to encourage it.

2. There's a syntactic problem, stemming from the perhaps unfortunate
choice to shoehorn coldeflists into the SQL alias syntax: if you've got
SELECT foo(...) AS ...
it'd be impossible to tell after seeing AS whether what follows is a
coldeflist (which'd need to be parsed as part of the function call)
or a column alias (which'd need to not be). So this would be a
shift-or-reduce conflict for bison, and I venture that humans would
get confused too.

There are also implementation-level reasons to want to keep this behavior
tied to RTE_FUNCTION RTEs rather than being loose in the expression
tree evaluator: we can much more easily handle RTEs that return a random
collection of columns than we can handle arbitrary rowtypes in
expressions. In particular, the latter works only with the "blessed
rowtype" hack, which doesn't scale nicely to lots of different rowtypes
used in a session. And I've always considered that to be strictly a
runtime thing, too --- I don't want the interpretation of parsetrees
to require consulting the anonymous-rowtype cache.

So on the whole, I can't get excited about decoupling coldeflists
from function RTEs. Even if I were excited about it, I'd see it
as a separate feature unrelated to the stated goals of this patch.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-20 20:07:17
Message-ID: 32143.1384978037@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> The spec syntax for table function calls, <table function derived table>
> in <table reference>, looks like TABLE(func(args...)) AS ...

> This patch implements that, plus an extension: it allows multiple
> functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
> and defines this as meaning that the functions are to be evaluated in
> parallel.

I went back and looked at the spec, and so far as I can tell, the claim
that this is spec syntax plus an extension is a falsehood. What
I read in SQL:2008 7.6 <table reference> is

<table function derived table> ::=
TABLE <left paren> <collection value expression> <right paren>

where <collection value expression> is elsewhere defined to be an
expression returning an array or multiset value, and then syntax rule 2
says:

* the <collection value expression> shall be a <routine invocation>

* this construct is equivalent to UNNEST ( <collection value expression> )

So unless I'm misreading it, the spec's idea is that you could write

SELECT ... FROM TABLE( function_returning_array(...) )

and this would result in producing the array elements as a table column.
There is nothing in there about a function returning set. You could argue
that that leaves us with the freedom to define what the construct does
for functions returning set --- but as this patch stands, if a function
doesn't return set but does return an array, the behavior will not be what
the spec plainly demands.

I do like the basic concept of this syntax, but I think it's a serious
error to appropriate the TABLE() spelling for something that doesn't
agree with the spec's semantics for that spelling. We need to spell it
some other way.

I've not experimented to see what's practical in bison, but a couple
of ideas that come to mind are:

1. Use FUNCTION instead of TABLE.

2. Don't use any keyword, just parens. Right now you get a syntax error
from that:

regression=# select * from (foo(), bar()) s;
ERROR: syntax error at or near ","
LINE 1: select * from (foo(), bar()) s;
^

which implies that it's syntax space we could commandeer. On the other
hand, I'm a bit worried about the future-proof-ness of such a choice.
It's uncomfortably close to one of the ways to write a row expression,
so it's not too hard to foresee the SQL committee someday defining
something like this in FROM clauses. It's also hard to see what you'd
call the construct in documentation or error messages --- no keyword means
no easy name to apply.

Thoughts, other ideas?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:03:12
Message-ID: CA+Tgmobg4AddwQkxpJLSCTcyz6deYXSv73S8Fn=x9Tq7cxCixg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 20, 2013 at 3:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
>> The spec syntax for table function calls, <table function derived table>
>> in <table reference>, looks like TABLE(func(args...)) AS ...
>
>> This patch implements that, plus an extension: it allows multiple
>> functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
>> and defines this as meaning that the functions are to be evaluated in
>> parallel.
>
> I went back and looked at the spec, and so far as I can tell, the claim
> that this is spec syntax plus an extension is a falsehood. What
> I read in SQL:2008 7.6 <table reference> is
>
> <table function derived table> ::=
> TABLE <left paren> <collection value expression> <right paren>
>
> where <collection value expression> is elsewhere defined to be an
> expression returning an array or multiset value, and then syntax rule 2
> says:
>
> * the <collection value expression> shall be a <routine invocation>
>
> * this construct is equivalent to UNNEST ( <collection value expression> )
>
> So unless I'm misreading it, the spec's idea is that you could write
>
> SELECT ... FROM TABLE( function_returning_array(...) )
>
> and this would result in producing the array elements as a table column.
> There is nothing in there about a function returning set. You could argue
> that that leaves us with the freedom to define what the construct does
> for functions returning set --- but as this patch stands, if a function
> doesn't return set but does return an array, the behavior will not be what
> the spec plainly demands.

The original post on this thread includes this example, which mixes
SRFs and arrays by running the array through UNNEST:

select * from table(generate_series(10,20,5), unnest(array['fred','jim']));

But if we think the spec calls for things to be implicitly unnested,
you could still get the same effect by adjusting the query. You'd
just get rid of the UNNEST from the argument that had it and wrap
ARRAY(SELECT ...) around the other one:

select * from table(array(select generate_series(10,20,5)),
array['fred','jim']);

It's not clear to me whether that's likely to be inefficient in
practical cases, but there's no real loss of functionality. IOW, I'm
not sure we really need to invent a new syntax here; maybe we can just
implement the spec, assuming your interpretation thereof is correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:11:55
Message-ID: 87ob5ewlq5.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> and this would result in producing the array elements as a table
Tom> column. There is nothing in there about a function returning
Tom> set.

In the spec, there is no such thing as a function returning a set of
rows in the sense that we use.

Functions can return arrays or multisets of simple or composite types,
but a multiset is a single value like an array (just with slightly
different semantics), not a set of rows. (And in particular it's not
ordered.)

--
Andrew (irc:RhodiumToad)


From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:19:28
Message-ID: 1385000368151-5779512.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote
> select * from table(array(select generate_series(10,20,5)),
> array['fred','jim']);

Can we have our arrays and eat our functions too? (and is someone willing to
bake such a complicated cake...)

select * from table ( ARRAY | FUNCTION/SET [, ARRAY | FUNCTION/SET ]* )

The standard-compliant case is handled as required - and those who want to
write compliant code can use the array(select function) trick - while others
can avoid straining their eyes and fingers.

Since we would have to invent implicit unnesting anyway to conform, and the
function version is working currently, the suggested behavior would seem to
be the ideal target.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/UNNEST-with-multiple-args-and-TABLE-with-multiple-funcs-tp5767280p5779512.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:27:31
Message-ID: 28537.1385000851@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> and this would result in producing the array elements as a table
> Tom> column. There is nothing in there about a function returning
> Tom> set.

> In the spec, there is no such thing as a function returning a set of
> rows in the sense that we use.

Right, but they do have a concept of arrays that's similar to ours,
and AFAICS the spec demands different behavior for an array-returning
function than what we've got here.

We could conceivably say that we'll implicitly UNNEST() if the function
returns array, and not otherwise --- but that seems pretty inconsistent
and surprise-making to me. I'm not too sure what to do if a function
returns setof array, either.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:36:33
Message-ID: 28746.1385001393@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> The original post on this thread includes this example, which mixes
> SRFs and arrays by running the array through UNNEST:

> select * from table(generate_series(10,20,5), unnest(array['fred','jim']));

> But if we think the spec calls for things to be implicitly unnested,
> you could still get the same effect by adjusting the query. You'd
> just get rid of the UNNEST from the argument that had it and wrap
> ARRAY(SELECT ...) around the other one:

> select * from table(array(select generate_series(10,20,5)),
> array['fred','jim']);

> It's not clear to me whether that's likely to be inefficient in
> practical cases,

Yeah, it would be :-(. Maybe we could hack something to translate
unnest(array(...)) into a no-op, but ugh. You really want to make
people use a syntax like that, even if it weren't inefficient?
It's verbose, ugly, unintuitive, and redundant given that you could
just write UNNEST() instead of TABLE(). But more: we *know* what
the common case is going to be, based on existing usage of SRFs,
and forced-unnest ain't it. So I'm thinking benign neglect of the
spec's syntax is the way to go. If anyone does come along and say
they want the spec's semantics, let them implement it, and the
syntax to go with it.

regards, tom lane


From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:36:47
Message-ID: 1385001407576-5779515.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane-2 wrote
> Andrew Gierth &lt;

> andrew(at)(dot)org

> &gt; writes:
>> "Tom" == Tom Lane &lt;

> tgl(at)(dot)pa

> &gt; writes:
>> Tom> and this would result in producing the array elements as a table
>> Tom> column. There is nothing in there about a function returning
>> Tom> set.
>
>> In the spec, there is no such thing as a function returning a set of
>> rows in the sense that we use.
>
> Right, but they do have a concept of arrays that's similar to ours,
> and AFAICS the spec demands different behavior for an array-returning
> function than what we've got here.
>
> We could conceivably say that we'll implicitly UNNEST() if the function
> returns array, and not otherwise --- but that seems pretty inconsistent
> and surprise-making to me. I'm not too sure what to do if a function
> returns setof array, either.

If a function returns a scalar array (RETURNS text[]) we would unnest the
array per-spec. If it returns a set (RETURN setof anything {including a
single array}) we would not unnest it since set returning functions are
non-spec - instead we'd use our SRF processing routine. If the function
returns a scalar non-array the implicit single-row returned by the function
would be output.

How would the spec interpret:

CREATE FUNCTION f(IN text, OUT text[]) RETURNS record AS $$ ...

TABLE( f('id_123') )

If that is illegal because the result is not just a single array value then
we would not unnest the component array and would also output the implicit
single-row.

My $0.02, quickly gathered

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/UNNEST-with-multiple-args-and-TABLE-with-multiple-funcs-tp5767280p5779515.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:42:48
Message-ID: 28893.1385001768@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Johnston <polobo(at)yahoo(dot)com> writes:
> Tom Lane-2 wrote
>> We could conceivably say that we'll implicitly UNNEST() if the function
>> returns array, and not otherwise --- but that seems pretty inconsistent
>> and surprise-making to me. I'm not too sure what to do if a function
>> returns setof array, either.

> If a function returns a scalar array (RETURNS text[]) we would unnest the
> array per-spec. If it returns a set (RETURN setof anything {including a
> single array}) we would not unnest it since set returning functions are
> non-spec - instead we'd use our SRF processing routine. If the function
> returns a scalar non-array the implicit single-row returned by the function
> would be output.

I find that way too inconsistent to be a sane specification.

regards, tom lane


From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 02:51:14
Message-ID: 1385002274983-5779518.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane-2 wrote
> We could conceivably say that we'll implicitly UNNEST() if the function
> returns array, and not otherwise --- but that seems pretty inconsistent
> and surprise-making to me.

The use-cases for putting a scalar array returning function call into a
TABLE construct, and NOT wanting the array to be un-nested, are likely few
and far between.

Neither the inconsistency nor surprise-making are serious deal-breakers for
me.

And if we do go with the "screw the standard" approach then we should just
state right now that we will never adhere to standard on "inconsistency
grounds" and not even encourage others to make it work. If "TABLE(
array_scalar_func() )" ends up only returning a single row then nothing can
be done to make it unnest the array and conform with the syntax without
breaking backward compatibility.

I'd rather change "TABLE" to "FUNCTION" and leave the implementation of
TABLE open for future standards-compliance - which maybe you do as well and
just haven't carried that sentiment to your more recent responses

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/UNNEST-with-multiple-args-and-TABLE-with-multiple-funcs-tp5767280p5779518.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 15:07:53
Message-ID: 21317.1385046473@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Johnston <polobo(at)yahoo(dot)com> writes:
> Tom Lane-2 wrote
>> We could conceivably say that we'll implicitly UNNEST() if the function
>> returns array, and not otherwise --- but that seems pretty inconsistent
>> and surprise-making to me.

> The use-cases for putting a scalar array returning function call into a
> TABLE construct, and NOT wanting the array to be un-nested, are likely few
> and far between.

> Neither the inconsistency nor surprise-making are serious deal-breakers for
> me.

Well, they are for me ;-). I'm concerned for example about how we get
ruleutils.c to reverse-list into a form that's certain to be interpreted
the same by the parser.

The whole business with the spec's reading of TABLE() seems bizarre.
AFAICS there is nothing about TABLE(foo()) that you can't get with
greater clarity by writing UNNEST(foo()) instead. And it's not like
it's a legacy feature --- SQL99 has single-argument UNNEST() but not
TABLE(), so why'd they add TABLE() later, and why'd they make it a
strict subset of what UNNEST() can do? I can't escape the suspicion
that I'm misreading the spec somehow ... but the text seems perfectly
clear.

Anyway, after further thought I've come up with an approach that's purely
a syntactic transformation and so less likely to cause surprise: let's
say that if we have TABLE() with a single argument, and no coldeflist
either inside or outside, then we implicitly insert UNNEST(). Otherwise
not. This is sufficient to satisfy the case spelled out in the standard,
but it doesn't get in the way of any more-typical use of TABLE().
In particular, if you don't want the implicit UNNEST(), you can just leave
off TABLE(), because the case where we'd insert it has no features you
can't get in the old syntax. Similarly, because ruleutils.c is already
coded not to bother with printing TABLE() if there's a single function and
no coldeflist, we needn't worry about falling foul of the implicit action
when a printed view is re-parsed.

BTW, I looked into the option of choosing a different syntax altogether,
but the results weren't too promising. FUNCTION() doesn't work unless
we're willing to make that keyword partially reserved, which seems like
a bad thing. (TABLE() works because TABLE is already a fully reserved
word.) The idea with no keyword at all might work, but it seems way too
likely to cause confusion, especially if you think about parenthesized
JOIN syntax as an alternative possibility for some slightly-typoed query.

Thoughts?

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 17:05:11
Message-ID: 878uwhwv0i.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Anyway, after further thought I've come up with an approach
Tom> that's purely a syntactic transformation and so less likely to
Tom> cause surprise: let's say that if we have TABLE() with a single
Tom> argument, and no coldeflist either inside or outside, then we
Tom> implicitly insert UNNEST(). Otherwise not.

This seems ugly beyond belief.

Specifically, having TABLE(foo()) and TABLE(foo(),bar()) be radically
different constructs, and likewise TABLE(foo()) and TABLE(foo() AS (...)),
strikes me as highly objectionable.

If there isn't a reasonable syntax alternative to TABLE(...) for the
multiple functions case, then frankly I think we should go ahead and
burn compatibility with a spec feature which appears to be of negative
value.

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-21 17:22:57
Message-ID: 6975.1385054577@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> Anyway, after further thought I've come up with an approach
> Tom> that's purely a syntactic transformation and so less likely to
> Tom> cause surprise: let's say that if we have TABLE() with a single
> Tom> argument, and no coldeflist either inside or outside, then we
> Tom> implicitly insert UNNEST(). Otherwise not.

> This seems ugly beyond belief.

True :-(

> If there isn't a reasonable syntax alternative to TABLE(...) for the
> multiple functions case, then frankly I think we should go ahead and
> burn compatibility with a spec feature which appears to be of negative
> value.

TBH, I'm getting close to that conclusion too. The more I look at the
spec, the more I think it must be a mistake, or else I'm somehow reading
it wrong, because it sure makes no sense for them to have invented
something that's just an alternative and less-clear syntax for a feature
they already had.

Can anyone who's following this thread check the behavior of Oracle or
DB2 to see if they interpret TABLE() the way I think the spec says?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-22 00:40:25
Message-ID: 12738.1385080825@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've committed this patch after some significant editorialization, but
leaving the use of TABLE( ... ) syntax in-place. If we decide that we
don't want to risk doing that, we can change to some other syntax later.

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-22 12:53:10
Message-ID: 87eh68vbvm.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> I've committed this patch after some significant editorialization, but
Tom> leaving the use of TABLE( ... ) syntax in-place. If we decide that we
Tom> don't want to risk doing that, we can change to some other syntax later.

Is this intended:

create function foo() returns setof footype language plpgsql
as $f$ begin return next row(1,true); end; $f$;

select pg_typeof(f), row_to_json(f) from foo() with ordinality f(p,q);
pg_typeof | row_to_json
-----------+---------------------------------
record | {"p":1,"q":true,"ordinality":1}
(1 row)

select pg_typeof(f), row_to_json(f) from foo() f(p,q);
pg_typeof | row_to_json
-----------+------------------
footype | {"a":1,"b":true}
(1 row)

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-22 17:18:37
Message-ID: 29388.1385140717@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> Is this intended:

[ I assume you forgot a create type footype here ]

> create function foo() returns setof footype language plpgsql
> as $f$ begin return next row(1,true); end; $f$;

> select pg_typeof(f), row_to_json(f) from foo() with ordinality f(p,q);
> pg_typeof | row_to_json
> -----------+---------------------------------
> record | {"p":1,"q":true,"ordinality":1}
> (1 row)

> select pg_typeof(f), row_to_json(f) from foo() f(p,q);
> pg_typeof | row_to_json
> -----------+------------------
> footype | {"a":1,"b":true}
> (1 row)

Well, it's not insane on its face. The rowtype of f in the first example
is necessarily a built-on-the-fly record, but in the second case using
the properties of the underlying named composite type is possible, and
consistent with what happens in 9.3 and earlier. (Not that I'm claiming
we were or are totally consistent ...)

regards, tom lane


From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-22 18:05:30
Message-ID: 8761rkuy4x.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> [ I assume you forgot a create type footype here ]

yeah, sorry

Tom> Well, it's not insane on its face. The rowtype of f in the
Tom> first example is necessarily a built-on-the-fly record, but in
Tom> the second case using the properties of the underlying named
Tom> composite type is possible, and consistent with what happens in
Tom> 9.3 and earlier. (Not that I'm claiming we were or are totally
Tom> consistent ...)

Right, but your changes to the code make it look like there was an
intended change there - with the scan type tupdesc being forced to
RECORD type and its column names changed.

--
Andrew (irc:RhodiumToad)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-11-22 19:47:31
Message-ID: 24073.1385149651@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Tom> Well, it's not insane on its face. The rowtype of f in the
> Tom> first example is necessarily a built-on-the-fly record, but in
> Tom> the second case using the properties of the underlying named
> Tom> composite type is possible, and consistent with what happens in
> Tom> 9.3 and earlier. (Not that I'm claiming we were or are totally
> Tom> consistent ...)

> Right, but your changes to the code make it look like there was an
> intended change there - with the scan type tupdesc being forced to
> RECORD type and its column names changed.

I did set things up so that if you have a RECORD result, the column names
will be those of the query's alias list; this was in response to the
comment in the patch that complained that we were inconsistent about where
we were getting the names from if you had a mix of named-composite
functions and other functions. I believe what is happening in the case
you show is that the function is returning a composite Datum that's marked
with the composite type's OID, and the upstream consumers are looking at
that, not at the scan tupdesc. I'm not really excited about tracing down
exactly what the data flow is ...

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-02 23:02:42
Message-ID: 20131202230242.GA1148155@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 21, 2013 at 12:22:57PM -0500, Tom Lane wrote:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
> > If there isn't a reasonable syntax alternative to TABLE(...) for the
> > multiple functions case, then frankly I think we should go ahead and
> > burn compatibility with a spec feature which appears to be of negative
> > value.
>
> TBH, I'm getting close to that conclusion too. The more I look at the
> spec, the more I think it must be a mistake, or else I'm somehow reading
> it wrong, because it sure makes no sense for them to have invented
> something that's just an alternative and less-clear syntax for a feature
> they already had.
>
> Can anyone who's following this thread check the behavior of Oracle or
> DB2 to see if they interpret TABLE() the way I think the spec says?

Oracle's closest analog to SQL-standard arrays is its "varray" feature, and
TABLE() behaves like our UNNEST() for those. Note that Oracle has no UNNEST.

*SQL> CREATE OR REPLACE TYPE intarray AS VARRAY(100) OF int;
* 2 /

Type created.

*SQL> select * from table(intarray(1,2,3));

COLUMN_VALUE
------------
1
2
3

I don't have a DB2 installation within reach, but its documentation implies
that UNNEST and TABLE are interchangeable:

http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/index.jsp?topic=%2Fcom.ibm.db2.luw.sql.ref.doc%2Fdoc%2Fr0055064.html

If anyone can test "select x from table(trim_array(array[1,2,3], 1)) as t(x);"
in DB2 and provide the output, that would be helpful.

On Thu, Nov 21, 2013 at 10:07:53AM -0500, Tom Lane wrote:
> The whole business with the spec's reading of TABLE() seems bizarre.
> AFAICS there is nothing about TABLE(foo()) that you can't get with
> greater clarity by writing UNNEST(foo()) instead. And it's not like
> it's a legacy feature --- SQL99 has single-argument UNNEST() but not
> TABLE(), so why'd they add TABLE() later, and why'd they make it a
> strict subset of what UNNEST() can do? I can't escape the suspicion
> that I'm misreading the spec somehow ... but the text seems perfectly
> clear.

That's how I read it, too. My hypothesis is that the standard adopted TABLE()
to rubber-stamp Oracle's traditional name for UNNEST().

On Wed, Nov 20, 2013 at 03:07:17PM -0500, Tom Lane wrote:
> I do like the basic concept of this syntax, but I think it's a serious
> error to appropriate the TABLE() spelling for something that doesn't
> agree with the spec's semantics for that spelling. We need to spell it
> some other way.

I realize you may have changed your mind later in the thread, but I share this
original sentiment. I think of this feature as optimization of and syntactic
sugar for full outer joins on ordinality columns. Compare these queries:

select * from table(generate_series(1,3), generate_series(2,5))
with ordinality as t(g1,g2);
select g1, g2, ordinality from generate_series(1,3) with ordinality as g1
full join generate_series(2,5) with ordinality as g2 using (ordinality);

The new syntax is limited to function calls, but I could imagine extending it
to take arbitrary subqueries (or, at the cost of inviting folks to depend on
subject-to-change row order, arbitrary from_item's). If this project were
just starting, I'd probably favor optimizing ordinality joins in the planner
rather than introducing special syntax to request the optimization. I don't
claim that's sufficiently better to justify the extensive rework it would now
entail, though. Therefore, I propose merely changing the syntax to "TABLE FOR
ROWS (...)". As a comparison, think of the standard syntax as "TABLE [FOR
ELEMENTS] (...)". Here is a longer list of conflict-free syntax choices that
I considered before settling on that one:

FUNCTIONS TABLE
FUNCTIONS TO TABLE
ROWS FOR
ROWS FOR EACH
ROWS FROM
ROWS FROM EACH
ROWS FROM FUNCTIONS
ROWS TO TABLE
TABLE (ROWS OF f0(), ROWS OF f1())
TABLE BY FUNCTIONS
TABLE BY ROW
TABLE FOR
TABLE FOR FUNCTION ROWS
TABLE FOR FUNCTIONS
TABLE FOR ROWS
TABLE FOR ROWS OF
TABLE FROM
TABLE FROM FUNCTION ROWS
TABLE FROM FUNCTIONS
TABLE OF
TABLE OF EACH
TABLE OF FUNCTION ROWS
TABLE OF FUNCTIONS
TABLE OF ROWS
TABLE OF ROWS OF EACH

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 01:56:03
Message-ID: 29437.1386035763@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> That's how I read it, too. My hypothesis is that the standard adopted TABLE()
> to rubber-stamp Oracle's traditional name for UNNEST().

Hmm ... plausible.

> ... I propose merely changing the syntax to "TABLE FOR ROWS (...)".

Ugh :-(. Verbose and not exactly intuitive, I think. I don't like
any of the other options you listed much better. Still, the idea of
using more than one word might get us out of the bind that a single
word would have to be a fully reserved one.

> ROWS FROM

This one's a little less awful than the rest. What about "ROWS OF"?

regards, tom lane


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 04:26:49
Message-ID: 20131203042649.GA1163520@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 02, 2013 at 08:56:03PM -0500, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > ... I propose merely changing the syntax to "TABLE FOR ROWS (...)".
>
> Ugh :-(. Verbose and not exactly intuitive, I think. I don't like
> any of the other options you listed much better. Still, the idea of
> using more than one word might get us out of the bind that a single
> word would have to be a fully reserved one.
>
> > ROWS FROM
>
> This one's a little less awful than the rest. What about "ROWS OF"?

I had considered ROWS OF and liked it, but I omitted it from the list on
account of the shift/reduce conflict from a naturally-written Bison rule.
Distinguishing it from a list of column aliases takes extra look-ahead. We
could force that to work. However, if we ever wish to allow an arbitrary
from_item in the list, it would become ambiguous: is this drawing rows from
"a" or just using an alias with a column list?

WITH a AS (SELECT oid FROM pg_am ORDER BY 1) SELECT * FROM rows of(a, a);

ROWS FOR is terse and conflict-free. "FOR" evokes the resemblance to looping
over the parenthesized section with the functions acting as generators.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 12:59:43
Message-ID: CA+Tgmoa3Z8DZXsbk=Z=dik_ubiugEtH7nsz5xifGWf0yka4UYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 2, 2013 at 11:26 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Mon, Dec 02, 2013 at 08:56:03PM -0500, Tom Lane wrote:
>> Noah Misch <noah(at)leadboat(dot)com> writes:
>> > ... I propose merely changing the syntax to "TABLE FOR ROWS (...)".
>>
>> Ugh :-(. Verbose and not exactly intuitive, I think. I don't like
>> any of the other options you listed much better. Still, the idea of
>> using more than one word might get us out of the bind that a single
>> word would have to be a fully reserved one.
>>
>> > ROWS FROM
>>
>> This one's a little less awful than the rest. What about "ROWS OF"?
>
> I had considered ROWS OF and liked it, but I omitted it from the list on
> account of the shift/reduce conflict from a naturally-written Bison rule.
> Distinguishing it from a list of column aliases takes extra look-ahead. We
> could force that to work. However, if we ever wish to allow an arbitrary
> from_item in the list, it would become ambiguous: is this drawing rows from
> "a" or just using an alias with a column list?
>
> WITH a AS (SELECT oid FROM pg_am ORDER BY 1) SELECT * FROM rows of(a, a);
>
> ROWS FOR is terse and conflict-free. "FOR" evokes the resemblance to looping
> over the parenthesized section with the functions acting as generators.

I like the idea of using ROWS + some additional word. I think I
mildly prefer Tom's suggestion of ROWS FROM to your suggestion of ROWS
FOR, but I can live with either one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 14:46:07
Message-ID: 10523.1386081967@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Mon, Dec 02, 2013 at 08:56:03PM -0500, Tom Lane wrote:
>> Ugh :-(. Verbose and not exactly intuitive, I think. I don't like
>> any of the other options you listed much better. Still, the idea of
>> using more than one word might get us out of the bind that a single
>> word would have to be a fully reserved one.

> I had considered ROWS OF and liked it, but I omitted it from the list on
> account of the shift/reduce conflict from a naturally-written Bison rule.
> Distinguishing it from a list of column aliases takes extra look-ahead.

Hmm, yeah, you're right --- at least one of the first two words needs
to be reserved (not something that can be a ColId, at least), or else
we can't tell them from a table name and alias. So this approach
doesn't give us all that much extra wiggle room. We do have a number
of already-fully-reserved prepositions (FOR, FROM, IN, ON, TO) but none
of them seem like great choices.

> ROWS FOR is terse and conflict-free. "FOR" evokes the resemblance to looping
> over the parenthesized section with the functions acting as generators.

Meh. I don't find that analogy compelling.

After sleeping on it, your other suggestion of TABLE OF, or possibly
TABLE FROM, is starting to grow on me.

Who else has an opinion?

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 15:03:32
Message-ID: 20131203150331.GI17272@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> After sleeping on it, your other suggestion of TABLE OF, or possibly
> TABLE FROM, is starting to grow on me.
>
> Who else has an opinion?

Alright, for my 2c, I like having this syntax include 'TABLE' simply
because it's what folks coming from Oracle might be looking for.
Following from that, to keep it distinct from the spec's notion of
'TABLE', my preference is 'TABLE FROM'. I don't particularly like
'TABLE OF', nor do I like the various 'ROWS' suggestions.

Thanks,

Stephen


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 17:37:40
Message-ID: CAFj8pRAL9RQbJmcwWmse4ZzabaVu++Kg-Gbi4T8kkeqgktPKuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2013/12/3 Stephen Frost <sfrost(at)snowman(dot)net>

> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > After sleeping on it, your other suggestion of TABLE OF, or possibly
> > TABLE FROM, is starting to grow on me.
> >
> > Who else has an opinion?
>
> Alright, for my 2c, I like having this syntax include 'TABLE' simply
> because it's what folks coming from Oracle might be looking for.
> Following from that, to keep it distinct from the spec's notion of
> 'TABLE', my preference is 'TABLE FROM'. I don't particularly like
> 'TABLE OF', nor do I like the various 'ROWS' suggestions.
>

+1

Pavel

>
> Thanks,
>
> Stephen
>


From: Noah Misch <noah(at)leadboat(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 18:57:16
Message-ID: 20131203185716.GG1163520@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 03, 2013 at 10:03:32AM -0500, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > After sleeping on it, your other suggestion of TABLE OF, or possibly
> > TABLE FROM, is starting to grow on me.
> >
> > Who else has an opinion?
>
> Alright, for my 2c, I like having this syntax include 'TABLE' simply
> because it's what folks coming from Oracle might be looking for.
> Following from that, to keep it distinct from the spec's notion of
> 'TABLE', my preference is 'TABLE FROM'. I don't particularly like
> 'TABLE OF', nor do I like the various 'ROWS' suggestions.

I like having "ROWS" in there somehow, because it denotes the distinction from
SQL-standard TABLE(). Suppose we were to implement the SQL-standard TABLE(),
essentially just mapping it to UNNEST(). Then we'd have "TABLE (f())" that
unpacks the single array returned by f(), and we'd have "TABLE FROM (f())"
that unpacks the set of rows returned by f(). The word "FROM" alone does not
indicate that difference the way including "ROWS" does. (I don't object to
having "FROM" in addition to "ROWS".)

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 19:27:06
Message-ID: 26144.1386098826@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Tue, Dec 03, 2013 at 10:03:32AM -0500, Stephen Frost wrote:
>> Alright, for my 2c, I like having this syntax include 'TABLE' simply
>> because it's what folks coming from Oracle might be looking for.
>> Following from that, to keep it distinct from the spec's notion of
>> 'TABLE', my preference is 'TABLE FROM'. I don't particularly like
>> 'TABLE OF', nor do I like the various 'ROWS' suggestions.

> I like having "ROWS" in there somehow, because it denotes the distinction from
> SQL-standard TABLE(). Suppose we were to implement the SQL-standard TABLE(),
> essentially just mapping it to UNNEST(). Then we'd have "TABLE (f())" that
> unpacks the single array returned by f(), and we'd have "TABLE FROM (f())"
> that unpacks the set of rows returned by f(). The word "FROM" alone does not
> indicate that difference the way including "ROWS" does.

Hm ... fair point, except that "ROWS" doesn't seem to suggest the right
thing either, at least not to me. After further thought I've figured
out what's been grating on me about Noah's suggestions: he suggests that
we're distinguishing "TABLE [FROM ELEMENTS]" from "TABLE FROM ROWS",
but this is backwards. What UNNEST() really does is take an array,
extract the elements, and make a table of those. Similarly, what our
feature does is take a set (the result of a set-returning function),
extract the rows, and make a table of those. So what would seem
appropriate to me is "TABLE [FROM ARRAY]" versus "TABLE FROM SET".
Now I find either of those phrases to be one word too many, but the key
point is that I'd probably prefer something involving SET over something
involving ROWS. (Both of those are unreserved_keyword, so this doesn't
move the ball at all in terms of finding an unambiguous syntax.)

Another issue is that if you are used to the Oracle syntax, in which an
UNNEST() is presumed, it's not exactly clear that TABLE ROWS, or any other
phrase including TABLE, *doesn't* also imply an UNNEST. So to me that's
kind of a strike against Stephen's preference --- I'm thinking we might be
better off not using the word TABLE.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-03 20:15:13
Message-ID: 20131203201513.GR17272@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Another issue is that if you are used to the Oracle syntax, in which an
> UNNEST() is presumed, it's not exactly clear that TABLE ROWS, or any other
> phrase including TABLE, *doesn't* also imply an UNNEST. So to me that's
> kind of a strike against Stephen's preference --- I'm thinking we might be
> better off not using the word TABLE.

I see the concern there, but I would think a bit of documentation around
that would help them find UNNEST quickly, if that's what they're really
looking for. On the flip side, I imagine it could be jarring seeing
'TABLE FROM' when you're used to Oracle's 'TABLE'.

I haven't got any great suggestions about how to incorporate 'SET' and I
I do still like 'TABLE' as that's what we're building, but I'll be happy
to have this capability even if it's 'TABLE FROM SET ROWS THING'.

Thanks,

Stephen


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-06 01:36:53
Message-ID: 20131206013653.GA1209659@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 03, 2013 at 02:27:06PM -0500, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Tue, Dec 03, 2013 at 10:03:32AM -0500, Stephen Frost wrote:
> >> Alright, for my 2c, I like having this syntax include 'TABLE' simply
> >> because it's what folks coming from Oracle might be looking for.
> >> Following from that, to keep it distinct from the spec's notion of
> >> 'TABLE', my preference is 'TABLE FROM'. I don't particularly like
> >> 'TABLE OF', nor do I like the various 'ROWS' suggestions.
>
> > I like having "ROWS" in there somehow, because it denotes the distinction from
> > SQL-standard TABLE(). Suppose we were to implement the SQL-standard TABLE(),
> > essentially just mapping it to UNNEST(). Then we'd have "TABLE (f())" that
> > unpacks the single array returned by f(), and we'd have "TABLE FROM (f())"
> > that unpacks the set of rows returned by f(). The word "FROM" alone does not
> > indicate that difference the way including "ROWS" does.
>
> Hm ... fair point, except that "ROWS" doesn't seem to suggest the right
> thing either, at least not to me. After further thought I've figured
> out what's been grating on me about Noah's suggestions: he suggests that
> we're distinguishing "TABLE [FROM ELEMENTS]" from "TABLE FROM ROWS",
> but this is backwards. What UNNEST() really does is take an array,
> extract the elements, and make a table of those. Similarly, what our
> feature does is take a set (the result of a set-returning function),
> extract the rows, and make a table of those. So what would seem
> appropriate to me is "TABLE [FROM ARRAY]" versus "TABLE FROM SET".

Valid. On the other hand, tables *are* sets, so one could be forgiven for
wondering how an operation called TABLE FROM SET modifies anything. Since
order matters for this operation, I also get some mathematical angst from use
of the word "SET". When we added WITH ORDINALITY, set-returning functions
effectively became sequence-returning functions. (Not that actually using the
word SEQUENCE would be a net clarification.)

I model "ROWS FROM (f0(), f1())" as "cut from the following template,
row-wise, to make a table/set: (f0(), f1())".

> Another issue is that if you are used to the Oracle syntax, in which an
> UNNEST() is presumed, it's not exactly clear that TABLE ROWS, or any other
> phrase including TABLE, *doesn't* also imply an UNNEST. So to me that's
> kind of a strike against Stephen's preference --- I'm thinking we might be
> better off not using the word TABLE.

I could go either way on that.

Two naming proposals, "ROWS FROM" and "TABLE FROM", got an ACK from more than
one person apiece. I move that we settle on "ROWS FROM".

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-06 03:25:53
Message-ID: 27951.1386300353@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> Two naming proposals, "ROWS FROM" and "TABLE FROM", got an ACK from more than
> one person apiece. I move that we settle on "ROWS FROM".

I'm not sufficiently annoyed by "ROWS FROM" to object. Other opinions?

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-06 03:34:08
Message-ID: 20131206033407.GH17272@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > Two naming proposals, "ROWS FROM" and "TABLE FROM", got an ACK from more than
> > one person apiece. I move that we settle on "ROWS FROM".
>
> I'm not sufficiently annoyed by "ROWS FROM" to object. Other opinions?

Works well enough for me.

Thanks,

Stephen


From: Noah Misch <noah(at)leadboat(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNNEST with multiple args, and TABLE with multiple funcs
Date: 2013-12-09 19:40:56
Message-ID: 20131209194056.GA1289003@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 05, 2013 at 10:34:08PM -0500, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Noah Misch <noah(at)leadboat(dot)com> writes:
> > > Two naming proposals, "ROWS FROM" and "TABLE FROM", got an ACK from more than
> > > one person apiece. I move that we settle on "ROWS FROM".
> >
> > I'm not sufficiently annoyed by "ROWS FROM" to object. Other opinions?
>
> Works well enough for me.

Great. Here's the patch I'll be using.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
rename-from-table-v1.patch text/plain 43.2 KB