Re: Multiple SRF right after SELECT

Lists: pgsql-hackers
From: "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Multiple SRF right after SELECT
Date: 2008-03-19 12:03:48
Message-ID: e431ff4c0803190503m49eeef76pb91a692009a30331@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I wonder, if the following is correct and provides expected result:

test=# select generate_series(1, 2), generate_series(1, 4);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
1 | 3
2 | 4
(4 rows)

Actually I have two questions on this:
1. Is it correct at all to use SRF in select list, w/o explicit FROM?
Why then we do not allow using subselects that return multiple rows?
I'd rather expect that these two things work in similar manner.
2. Why the query above provides 4 rows, not 2*4=8? Actually, that's
interesting -- I can use this query to find l.c.m. But it's defenetely
not that I'd expect before my try...
--
Best regards,
Nikolay


From: Albert Cervera i Areny <albert(at)nan-tic(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com>
Subject: Re: Multiple SRF right after SELECT
Date: 2008-03-19 12:20:33
Message-ID: 200803191320.33679.albert@nan-tic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

A Dimecres 19 Març 2008, Nikolay Samokhvalov va escriure:
> 2. Why the query above provides 4 rows, not 2*4=8? Actually, that's
> interesting -- I can use this query to find l.c.m. But it's defenetely
> not that I'd expect before my try...

2*4 = 8:

select * from generate_series(1, 2) a, generate_series(1, 4) b;

Can't tell you about the expected behaviour in the query you provided though.

--
Albert Cervera i Areny
http://www.NaN-tic.com


From: David BOURIAUD <david(dot)bouriaud(at)ac-rouen(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multiple SRF right after SELECT
Date: 2008-03-19 12:47:03
Message-ID: 200803191347.06443.david.bouriaud@ac-rouen.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le mercredi 19 mars 2008, Albert Cervera i Areny a écrit :
Hi !

> A Dimecres 19 Març 2008, Nikolay Samokhvalov va escriure:
> > 2. Why the query above provides 4 rows, not 2*4=8? Actually, that's
> > interesting -- I can use this query to find l.c.m. But it's defenetely
> > not that I'd expect before my try...
>
> 2*4 = 8:
>
> select * from generate_series(1, 2) a, generate_series(1, 4) b;

If you launch the above query, you just get what you would get if you would do
a select from two tables without joining them at all...
So, you get the cartesian product of the two ensembles.

>
> Can't tell you about the expected behaviour in the query you provided
> though.

I've made few tests with the primary query, and indeed it is strange
behavoiour. Consider the following :

select generate_series(1, 3), generate_series(1, 4);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
3 | 3
1 | 4
2 | 1
3 | 2
1 | 3
2 | 4
3 | 1
1 | 2
2 | 3
3 | 4
which is not fully readeable but if you sort things, you get exactly the same
as what you mentionned before :

select generate_series(1, 3), generate_series(1, 4) order by 1,2;
generate_series | generate_series
-----------------+-----------------
1 | 1
1 | 2
1 | 3
1 | 4
2 | 1
2 | 2
2 | 3
2 | 4
3 | 1
3 | 2
3 | 3
3 | 4

So far it is clear, but if you just make things so that the the number of rows
returned by one call to generate_series is a multiple of the other, the
result is truncated :

select generate_series(1, 3), generate_series(1, 6) order by 1,2;
generate_series | generate_series
-----------------+-----------------
1 | 1
1 | 4
2 | 2
2 | 5
3 | 3
3 | 6

provides the same strange result as initialy discovered, and
select generate_series(1, 6), generate_series(1, 3) order by 2,1;
generate_series | generate_series
-----------------+-----------------
1 | 1
4 | 1
2 | 2
5 | 2
3 | 3
6 | 3

provides the same, mirrored. So, it could be a bug somewhere.
Hoping that it will be of any help...
Regards.

>
> --
> Albert Cervera i Areny
> http://www.NaN-tic.com


From: Volkan YAZICI <yazicivo(at)ttmail(dot)com>
To: "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multiple SRF right after SELECT
Date: 2008-03-19 13:08:53
Message-ID: 87lk4eoo7u.fsf@alamut.mobiliz.com.tr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 19 Mar 2008, "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com> writes:
> I wonder, if the following is correct and provides expected result:
>
> test=# select generate_series(1, 2), generate_series(1, 4);
> generate_series | generate_series
> -----------------+-----------------
> 1 | 1
> 2 | 2
> 1 | 3
> 2 | 4
> (4 rows)
>
>
> 1. Is it correct at all to use SRF in select list, w/o explicit FROM?
> Why then we do not allow using subselects that return multiple rows?
> I'd rather expect that these two things work in similar manner.
> 2. Why the query above provides 4 rows, not 2*4=8? Actually, that's
> interesting -- I can use this query to find l.c.m. But it's defenetely
> not that I'd expect before my try...

From PL/scheme sources:

/*
* There're 2 ways to return from an SRF:
*
* 1. Value-per-call Mode
* You return each tuple one by one via SRF_RETURN_NEXT() macro. But
* PG_RETURN_DATUM() calls in the macro, makes it quite
* impracticble. OTOH, this method gives opportunity to call SRFs in
* a fashion like "SELECT mysrf();"
*
* 2. Materialize Mode
* In this mode, you collect all tuples in a single set and return
* that set. When compared to previos method, it's not possible to
* use SRF of materialize mode like "SELECT my_materialized_srf();",
* instead, you need to access it as a simple table: "SELECT * FROM
* my_materialized_srf();".
*
* ...
*/

And I conclude to that generate_series() is written as a SRF function of
value-per-call mode. (Also you may want to check Returning Sets[1]
chapter at PostgreSQL manual.)

[1] http://www.postgresql.org/docs/current/static/xfunc-c.html#XFUNC-C-RETURN-SET

Regards.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multiple SRF right after SELECT
Date: 2008-03-19 15:10:26
Message-ID: 1620.1205939426@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Nikolay Samokhvalov" <samokhvalov(at)gmail(dot)com> writes:
> 1. Is it correct at all to use SRF in select list, w/o explicit FROM?

You can read about the current behavior in ExecTargetList, but basically
the idea is to cycle all the SRFs until they all say "done" at the same
time. So the number of result rows is the least common multiple of the
sizes of the SRF outputs.

This behavior is pretty random, I agree, but we inherited it from
Berkeley ... and even if we didn't care about breaking existing
applications that might rely on it, it's not exactly clear what it
should be instead.

The fact that there isn't any obviously right behavior is why I'd
prefer to deprecate SRFs in targetlists ...

regards, tom lane