Re: proposal: array utility functions phase 1

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: array utility functions phase 1
Date: 2002-12-09 01:56:37
Message-ID: 3DF3F855.1000606@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> This crystallizes something that has been bothering me for awhile: the
> table function syntax is severely hobbled (not to say crippled :-() by
> the fact that the function arguments have to be constants. You really
> don't want to have to invent intermediate functions every time you want
> a slightly different query --- yet this technique seems to require *two*
> bespoke functions for every query, one on each end of the array_values()
> function.

It did for me too. I was thinking along these lines while working on the
connectby function, but this work really makes it clear.

> The original Berkeley syntax, messy as it was, at least avoided this
> problem. For example, I believe this same problem could be solved
> (approximately) with
>
> select array_values(grolist) from pg_group where groname = 'g2'

Yes, this is exactly what I was yearning to do. Was there a spec or technical
reason (or both) for not allowing the following?

select * from array_values(g.grolist), pg_group g where g.groname = 'g2';

It seems like you could treat it like a one-to-many join between pg_group and
the function. I'm sure this is a bad idea and breaks down for more complex
examples, but I often have found myself wishing I could do exactly that.

> I think we ought to try to find a way that table functions could be used
> with inputs that are taken from tables. In a narrow sense you can do
> this already, with a sub-SELECT:
>
> select * from my_table_func((select x from ...));
>
> but (a) the sub-select can only return a single value, and (b) you can't
> get at any of the other columns in the row the sub-select is selecting.
> For instance it won't help me much to do
>
> select * from
> array_values((select grolist from pg_group where groname = 'g2'))
>
> if I want to show the group's grosysid as well.

You could do something like:
select * from array_values('pg_group','grolist') ...
and repeat the rest of pg_group's columns for each row produced from grolist
in the output (this is closer to what Nigel did, IIRC). This even works in the
current table function implementation. It does not get around the issue of
specifying querytime column refs though.

> I know I'm not explaining this very well (I'm only firing on one
> cylinder today :-(), but basically I think we need to step back and take
> another look at the mechanism before we start inventing tons of helper
> functions to make up for the lack of adequate mechanism.

Nope, you're explaining it just fine -- it's what I've been thinking for a
while, but couldn't articulate myself.

> As for array_values() itself, it seems fairly inelegant to rely on the
> user to get the input and output types to match up. Now that we have
> an "anyarray" pseudotype, I think it wouldn't be unreasonable to hack up
> some kluge in the parser to allow reference to the element type of such
> an argument --- that is, you'd say something like
>
> create function array_values(anyarray) returns setof anyarray_element
>
> and the parser would automatically understand what return type to assign
> to any particular use of array_values. (Since type resolution is done
> bottom-up, I see no logical difficulty here, though the implementation
> might be a bit of a wart...)

That doesn't quite work as written (you'd have to account for the array index
column or lose it -- which loses any ability to get position in the array),
and has even more problems with the array_values('pg_group','grolist') approach.

How ugly/difficult would it be to allow the planner to interrogate the
function and let the function report back a tupledesc based on the actual
runtime input parameters? Kind of a special mode of function call that the
function could detect and respond to differently than during execution (to
avoid excessive runtime an/or side effects -- just form a tupledesc and return
it). Then the planner could move forward without requiring a specific declared
return composite type or a return type of record with a runtime query column
definition.

Joe

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2002-12-09 03:48:54 Re: [GENERAL] PostgreSQL Global Development Group
Previous Message Ned Lilly 2002-12-09 01:53:17 Re: [GENERAL] PostgreSQL Global Development Group

Browse pgsql-patches by date

  From Date Subject
Next Message Rod Taylor 2002-12-09 02:19:30 ALTER DOMAIN .. OWNER TO ..
Previous Message Tom Lane 2002-12-08 22:06:40 Re: proposal: array utility functions phase 1