Returning multiple result sets

Lists: pgsql-hackers
From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Returning multiple result sets
Date: 2005-11-19 15:13:43
Message-ID: 20051119151343.GF8630@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've been thinking about this and wondered if this is a way to get it
done without too much work.

1. Create an "anyrecord" type to which any record type can be cast.
It's essentially a heaptuple with a tupledesc.

2. "anyrecord" is opaque to the parser, you cannot dereference it, only
output it or pass it to other functions accepting "anyrecord". Possibly
some dynamic languages may be able to reference these.

3. In the output functions printtup_startup/printtup, if any of the
result fields are of type "anyrecord" it defers the 'T' message until
the tuples arrive. When they do it expands the anyrecord and creates
columns from each and sends an appropriate 'T' message.

4. libpq already supports multiple result sets so no problem there.

The effect of this would be that you get multiple result sets, once for
each time the tupledesc changes. As for creating a split without
changing, maybe you need to return a special empty tuple which
signifies end-of-set.

It also occured to me we could just change the "record" type to do
this, but this would change the behaviour of:

test=# select x from test() as x;
x
-------
(1,2)
(1 row)

to:

a | b
---+---
1 | 2
(1 row)

But that's backward incompatable. OTOH, it would mean we could get rid
of the requirement that functions returning "record" must specify a
column definition list in the query. the only restriction is that you
can't dereference it, but that doesn't seem so big a deal.

So, kill a few birds with one stone. Any thoughts?
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-19 17:43:15
Message-ID: 20984.1132422195@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> So, kill a few birds with one stone. Any thoughts?

I don't think any of this will actually work :-(. There's too much code
that assumes that all the tuples returned by a query are alike, and I
for one don't feel like trying to find and fix it all. (Not all of it
is within our control, either --- this will break client code along with
the backend.)

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-19 18:02:42
Message-ID: 20051119180242.GG8630@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Nov 19, 2005 at 12:43:15PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > So, kill a few birds with one stone. Any thoughts?
>
> I don't think any of this will actually work :-(. There's too much code
> that assumes that all the tuples returned by a query are alike, and I
> for one don't feel like trying to find and fix it all. (Not all of it
> is within our control, either --- this will break client code along with
> the backend.)

I don't think so, as far as all the functions are concerned, the tuples
are all the same: when a function is called with anyrecord, it's passed
a single argument, the heaptuple+tupledesc. It's an opaque verlena type
that nothing is going to be able to access unless they actually go to
the effort.

All this does is essentially flatten records-in-tuples in the output
function.

Consider:

create function a(anyrecord) returns anyrecord;
create function b(int4) returns anyrecord;

select a(b(2));

Does anything in the backend other than those two functions need to
know the exact format of the "anyrecord"? Even if the actual records
contain 20 values, to everybody else it's just an opaque verlena type.
And in the output (and *only* on output), the printtup function can
examine the tupledesc to tell the client what data to expect.

Seems like it should be possible to me. Another way to put it would be
making records a first-class type. What am I missing?

Thanks in advance,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-19 23:02:08
Message-ID: 20051119230208.GB16420@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > So, kill a few birds with one stone. Any thoughts?
>
> I don't think any of this will actually work :-(. There's too much code
> that assumes that all the tuples returned by a query are alike, and I
> for one don't feel like trying to find and fix it all. (Not all of it
> is within our control, either --- this will break client code along with
> the backend.)

Hmm -- probably we could declare that the current libpq API will not
support multiple result sets from one query, and return only the first
one to the application discarding the rest. (It just occured to me --
what happens if one send multiple SELECTs in a semicolon-separated query
via libpq?). New apps wanting to take advantage of the new
functionality would need to invoke a different function.

At the protocol level this will need an extension anyway, so clients
using the protocol directly would need to be updated to understand
multiple results.

I know people migrating from SQL Server (maybe others?) are already
having trouble because of our inability to return multiple result sets.
The sooner we do it, the sooner all the code will be fixed ...

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Pavel Stehule" <pavel(dot)stehule(at)hotmail(dot)com>
To: kleptog(at)svana(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 07:43:05
Message-ID: BAY20-F428A1066834836F9E5429F9500@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


>
>Consider:
>
>create function a(anyrecord) returns anyrecord;
>create function b(int4) returns anyrecord;
>
>select a(b(2));
>

for my task I need little different form :-(

create function a(..) returns setof tables

but SQL2003 needs type table, and this can be solution

_________________________________________________________________
Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci.
http://messenger.msn.cz/


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 14:07:23
Message-ID: 20051120140717.GA32031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Nov 19, 2005 at 08:02:08PM -0300, Alvaro Herrera wrote:
> Hmm -- probably we could declare that the current libpq API will not
> support multiple result sets from one query, and return only the first
> one to the application discarding the rest. (It just occured to me --
> what happens if one send multiple SELECTs in a semicolon-separated query
> via libpq?). New apps wanting to take advantage of the new
> functionality would need to invoke a different function.

libpq supports it just fine. You do a PQsendQuery() and then as many
PQgetResult()s as it takes to get back the results. This worked for a
while AFAIK.

> At the protocol level this will need an extension anyway, so clients
> using the protocol directly would need to be updated to understand
> multiple results.

The protocol supports it fine. If the server sends a new 'T' record,
libpq assumes it's returning a new resultset. When the server sends a
"ready for query" message, libpq knows that the last resultset has
arrived.

It's the backend that needs work, not the protocol.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Pavel Stehule <pavel(dot)stehule(at)hotmail(dot)com>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 14:09:25
Message-ID: 20051120140925.GB32031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 20, 2005 at 08:43:05AM +0100, Pavel Stehule wrote:
> >Consider:
> >
> >create function a(anyrecord) returns anyrecord;
> >create function b(int4) returns anyrecord;
> >
> >select a(b(2));
> >
>
> for my task I need little different form :-(
>
> create function a(..) returns setof tables
>
> but SQL2003 needs type table, and this can be solution

You want a function return entire tables at a time? Why bother when you
can just return rows and signal when the next table starts?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 16:29:39
Message-ID: 28702.1132504179@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> libpq supports it just fine. You do a PQsendQuery() and then as many
> PQgetResult()s as it takes to get back the results. This worked for a
> while AFAIK.

That only works if the caller is prepared to read each result serially,
and not (say) a row at a time in parallel. There are a bunch of
ease-of-use problems as well, such as knowing which resultset is which,
coping with errors detected after the first resultset(s) are sent, etc.

A more realistic way of dealing with multiple resultsets is to deliver
them as named cursor references and allow the client to FETCH
reasonable-sized chunks. We can sort of handle this today, but it's
notationally painful at both the stored-procedure and client ends.

regards, tom lane


From: "Pavel Stehule" <pavel(dot)stehule(at)hotmail(dot)com>
To: kleptog(at)svana(dot)org
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 17:05:36
Message-ID: BAY20-F10756838B935AAF25FCD6BF9500@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > for my task I need little different form :-(
> >
> > create function a(..) returns setof tables
> >
> > but SQL2003 needs type table, and this can be solution
>
>You want a function return entire tables at a time? Why bother when you
>can just return rows and signal when the next table starts?
>

what is difference between rows with different structures and tables? Tables
are more logic. But I unlike function which returns setof tables. This need
data type table. I prefere normal clasic solution.

-------------- stored proc -------------- | --------------- client
------------------

function -> scalar, vector, table

procedure -> OUT params
-----------------------------
every free select --------------------------------> table
-----------------------------

I don't have imagine how I can write readable code with your proposal

variants one:

create function aaa returns setof anyrecord
begin
for each a in select * from temptab1
return next a;
end loop;
return next 'next table';
for each a in select * from temptab2
return next a;
end loop;
return next 'ok';
return;
end;

variants two:
create procedure aaa(OUT allok bool)
begin
select * from temptab1;
select * from temptab2;
a := true;
end;

I don't have better words :-). I am sorry. I don't wont to complicate
internal structure of planer, executor, etc ... Procedures are different
than functions, and can be executed different, Isn't possible using
procedure in params list.

Nice day
Pavel

_________________________________________________________________
Citite se osamele? Poznejte nekoho vyjmecneho diky Match.com.
http://www.msn.cz/


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 18:05:57
Message-ID: 20051120180556.GE32031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 20, 2005 at 11:29:39AM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > libpq supports it just fine. You do a PQsendQuery() and then as many
> > PQgetResult()s as it takes to get back the results. This worked for a
> > while AFAIK.
>
> That only works if the caller is prepared to read each result serially,
> and not (say) a row at a time in parallel. There are a bunch of
> ease-of-use problems as well, such as knowing which resultset is which,
> coping with errors detected after the first resultset(s) are sent, etc.

Urk! I don't think anyone is suggesting that resultsets can be
interleaved. Apart from being extremely unlike the current model in
PostgreSQL, I can't think of a use for it that isn't served just as
well by sending them sequentially.

> A more realistic way of dealing with multiple resultsets is to deliver
> them as named cursor references and allow the client to FETCH
> reasonable-sized chunks. We can sort of handle this today, but it's
> notationally painful at both the stored-procedure and client ends.

But if you run a function, it can only return a single row at a time.
Fiddling with cursors means you would have to queue up. After all, the
function is going to return the tuples in a fixed order that is
independant of when the client asks.

At the end of the day, the client can only accept data in the order the
server sends it. Having to request each row seem somewhat inefficient.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Pavel Stehule <pavel(dot)stehule(at)hotmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 18:18:18
Message-ID: 20051120181817.GF32031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 20, 2005 at 06:05:36PM +0100, Pavel Stehule wrote:
> what is difference between rows with different structures and tables?
> Tables are more logic. But I unlike function which returns setof tables.
> This need data type table. I prefere normal clasic solution.

You're confusing syntax with implementation. Internally the functions
would return a single tuple at a time. But there is no real reason why:

return table (select * from foo);

would not simply loop and return each tuple. We can create syntax as we
feel appropriate if we think it makes thing easier. Thus internally
your two variants would both work and do the same thing. Look at the
SQL language functions. There if you say 'select * from foo' it returns
the whole table without a loop...

Anyway, this discussion isn't really going to go anywhere without some
code. I'll see what I can do.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: "Pavel Stehule" <pavel(dot)stehule(at)hotmail(dot)com>
To: kleptog(at)svana(dot)org
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 20:06:10
Message-ID: BAY20-F2A7AE6DE50CE73B2C6FB8F9500@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I thinking about solution based on setof cursors. This solustion has three
big minus:
1. I can "unpack" cursors after finish of called procedure. If I get
exception, or long query, I can show nothing.
2. Old clients don't understand and don't unpack cursor. Statement call is
(+/-) == statement SELECT (more if call return only one table).
3. twice communication.

backend client
==============
<--------------------- call
-----------------------> cursors
<---------------------- select cursor
------------------------> table

this isn't pretty solution. May be with minimal changes in code. I think,
this need bigger changes and support next class of stored objects.

best regards
Pavel Stehule

_________________________________________________________________
Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci.
http://messenger.msn.cz/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 20:41:36
Message-ID: 19517.1132519296@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Sun, Nov 20, 2005 at 11:29:39AM -0500, Tom Lane wrote:
>> That only works if the caller is prepared to read each result serially,
>> and not (say) a row at a time in parallel.

> Urk! I don't think anyone is suggesting that resultsets can be
> interleaved.

No? If not, why not? The main reason why this is being pushed, IIRC,
is the claim that "you can do this easily in other databases". If you
don't want to support interleaved retrieval of multiple datasets, you
had better be prepared to prove that no other popular database can
do it either.

>> A more realistic way of dealing with multiple resultsets is to deliver
>> them as named cursor references and allow the client to FETCH
>> reasonable-sized chunks. We can sort of handle this today, but it's
>> notationally painful at both the stored-procedure and client ends.

> But if you run a function, it can only return a single row at a time.

This is not about what we can do today with PG functions; it's about
what can be done with a stored procedure in other RDBMSes.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-20 21:30:48
Message-ID: 20051120213046.GG32031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 20, 2005 at 03:41:36PM -0500, Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > Urk! I don't think anyone is suggesting that resultsets can be
> > interleaved.
>
> No? If not, why not? The main reason why this is being pushed, IIRC,
> is the claim that "you can do this easily in other databases". If you
> don't want to support interleaved retrieval of multiple datasets, you
> had better be prepared to prove that no other popular database can
> do it either.

I don't know if I can prove it. however, I do have a few datapoints:

1. In SQLJ when you call a stored procedure that returns multiple
datasets, you have to close a resultset before you can start on the
next one.

: Result sets are returned to the calling program in the same order
: that their cursors are opened in the stored procedure. When there are
: no more result sets to retrieve, getNextResultSet returns a null
: value.

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db2.doc.java/bjnkmstr81.htm

2. ASP seems to have the same restriction

http://www.w3schools.com/ado/met_rs_nextrecordset.asp

Note, we should distinguish here between (a) being able to send a query
before you've retreived all the data of the current one and having the
results of those interleaved, and (b) having the results of a single
query return two results sets interleaved.

I beleive the first is supported by other DBs but not us (other than
explicit cursors). I don't think any support the latter, but I can't
claim to have checked them all. Your point is taken though, I'll see if
I can find any evidence one way or the other.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Returning multiple result sets
Date: 2005-11-23 20:22:44
Message-ID: 200511232022.jANKMia21318@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> > libpq supports it just fine. You do a PQsendQuery() and then as many
> > PQgetResult()s as it takes to get back the results. This worked for a
> > while AFAIK.
>
> That only works if the caller is prepared to read each result serially,
> and not (say) a row at a time in parallel. There are a bunch of
> ease-of-use problems as well, such as knowing which resultset is which,
> coping with errors detected after the first resultset(s) are sent, etc.
>
> A more realistic way of dealing with multiple resultsets is to deliver
> them as named cursor references and allow the client to FETCH
> reasonable-sized chunks. We can sort of handle this today, but it's
> notationally painful at both the stored-procedure and client ends.

Is there a TODO here?

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073