Set generating functions and subqueries

Lists: pgsql-sql
From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject: Set generating functions and subqueries
Date: 2006-03-10 12:19:30
Message-ID: 44116ED2.9060901@logix-tt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hello,

Today, I stumbled across the following:

postgres=# select generate_series(1,2),'othercol';
generate_series | ?column?
-----------------+----------
1 | othercol
2 | othercol
(2 rows)

postgres=# select (select generate_series(1,2)),'othercol';
ERROR: more than one row returned by a subquery used as an expression

So it seems that set-returning functions "blow up" the resultset by
duplicating rows - so why is this not allowed for subqueries?

It is easy to refactor a subquery into a set-returning function, so I
think this violates the principle of orthogonality.

But there may be subtle reasons of ambiguity here I don't see right now.

(I know that usually, a JOIN would be the right thing to do here, but
I'm just curious why multi-row subqueries are not allowed.)

Btw, having several set-returning functions with equal or different set
lengths produce interesting results:

postgres=# select generate_series(1,2),generate_series(3,4),'othercol';
generate_series | generate_series | ?column?
-----------------+-----------------+----------
1 | 3 | othercol
2 | 4 | othercol
(2 rows)

postgres=# select generate_series(1,2),generate_series(3,5),'othercol';
generate_series | generate_series | ?column?
-----------------+-----------------+----------
1 | 3 | othercol
2 | 4 | othercol
1 | 5 | othercol
2 | 3 | othercol
1 | 4 | othercol
2 | 5 | othercol
(6 rows)

Is there any way to indicate that I want the cross-product if both
set-returning functions have the same length? This could lead to strange
effects if the sets have varying lengths otherwhise.
(One workaround would be to join two selects, each one having one
set-returning function.)

Btw, it is not possible to trick PostgreSQL into accepting multi-row
selects this way:

postgres=# select (select generate_series(1,2)),generate_series(3,4),'';
ERROR: more than one row returned by a subquery used as an expression

Have fun,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schaber <schabi(at)logix-tt(dot)com>
Cc: PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Set generating functions and subqueries
Date: 2006-03-10 14:48:48
Message-ID: 5142.1142002128@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Markus Schaber <schabi(at)logix-tt(dot)com> writes:
> postgres=# select (select generate_series(1,2)),'othercol';
> ERROR: more than one row returned by a subquery used as an expression

> So it seems that set-returning functions "blow up" the resultset by
> duplicating rows - so why is this not allowed for subqueries?

The behavior of the subquery expression is dictated by the SQL spec:

1) If the cardinality of a <scalar subquery> or a <row subquery> is
greater than 1, then an exception condition is raised: cardinal-
ity violation.

The fact that the other form is even allowed is more of a holdover from
PostQUEL than something we have consciously decided is a good idea.
(IMHO it's actually a fairly *bad* idea, because it does not work nicely
when there's more than one SRF in the same targetlist.) It'll probably
get phased out someday, if we can find a way to replace the
functionality. I seem to recall speculating that SQL2003's LATERAL
tables might do the job.

> Btw, having several set-returning functions with equal or different set
> lengths produce interesting results:

No kidding.

regards, tom lane


From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Set generating functions and subqueries
Date: 2006-03-13 12:34:23
Message-ID: 441566CF.2040105@logix-tt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi, Tom,

Tom Lane wrote:

> The behavior of the subquery expression is dictated by the SQL spec:
>
> 1) If the cardinality of a <scalar subquery> or a <row subquery> is
> greater than 1, then an exception condition is raised: cardinal-
> ity violation.

That's interesting to know, and it seems to be a clean design.

> The fact that the other form is even allowed is more of a holdover from
> PostQUEL than something we have consciously decided is a good idea.
> (IMHO it's actually a fairly *bad* idea, because it does not work nicely
> when there's more than one SRF in the same targetlist.) It'll probably
> get phased out someday, if we can find a way to replace the
> functionality. I seem to recall speculating that SQL2003's LATERAL
> tables might do the job.

AFAICS, it can be replaced with JOINs:

select * FROM (SELECT 'othercol' AS other) as foo CROSS JOIN (SELECT
generate_series(1,2) as a) as fop CROSS JOIN (SELECT
generate_series(3,4) as b) AS foq;
other | a | b
----------+---+---
othercol | 1 | 3
othercol | 2 | 3
othercol | 1 | 4
othercol | 2 | 4
(4 rows)

> No kidding.

I wasn't kidding, I just wanted to point out the different behaviour
between equal-length and inequal-length sequences.

Thanks,
markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schaber <schabi(at)logix-tt(dot)com>
Cc: PostgreSQL SQL List <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Set generating functions and subqueries
Date: 2006-03-13 14:57:39
Message-ID: 7169.1142261859@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Markus Schaber <schabi(at)logix-tt(dot)com> writes:
> Tom Lane wrote:
>> The fact that the other form is even allowed is more of a holdover from
>> PostQUEL than something we have consciously decided is a good idea.
>> (IMHO it's actually a fairly *bad* idea, because it does not work nicely
>> when there's more than one SRF in the same targetlist.) It'll probably
>> get phased out someday, if we can find a way to replace the
>> functionality. I seem to recall speculating that SQL2003's LATERAL
>> tables might do the job.

> AFAICS, it can be replaced with JOINs:

Not really; the case where joins don't help is where you want to
generate the SRF's output for each of the values appearing in a table.
There's an example in "SQL Functions Returning Sets" in the manual:
http://www.postgresql.org/docs/8.1/static/xfunc-sql.html#AEN31646

regards, tom lane