WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: WITH RECURSIVE ... CYCLE in vanilla SQL: issues with arrays of rows
Date: 2008-10-08 01:58:40
Message-ID: 20435.1223431120@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked a bit at the SQL:2008 spec for a CYCLE clause for WITH
RECURSIVE. It is interesting to see that it is just syntactic sugar,
because *they spell out how to expand it into regular SQL*. More,
they defined it in such a way that it's hard to optimize at all,
because the "path" column is exposed to the user; you don't really
have any choice about how to do it. There are some ugly and unnecessary
choices in there too, like insisting that the cycle mark column be
char(1).

So I am not feeling very excited about implementing the syntax per se
(and I note that DB2 doesn't seem to have done so either). Instead
we should document some examples of how to do cycle detection at the
SQL level. However, it would be nice if the spec's approach to cycle
detection actually worked well in Postgres. There are a couple of
things we seem to be missing, according to some experiments I just
did with trying to translate the spec's code into Postgres:

* The spec assumes that ARRAY[ROW(some columns)] works, ie, that you can
have an array of an anonymous record type. We don't allow that right
now, but it seems like a useful thing to have --- at least as a
transient value within a query. I'm not sure there's a case for
allowing such things to go to disk.

* The spec writes this to detect whether a row of an anonymous record
type is present in an array of that same anonymous record type:
ROW(some columns) IN (SELECT P.* FROM TABLE(array variable) P)
We haven't got the TABLE() syntax; you can sort of emulate it with a SRF
but only for arrays of named rowtypes. For an anonymous rowtype,
it's very unclear to me how the rowtype would be communicated at
parse time so that the P.* notation could be expanded properly.

* Instead of the above, we could try to make
ROW(some columns) = ANY (array variable)
work. This is shorter than the above syntax and would presumably have
a lot less overhead too. But it doesn't work right now, not even for
named rowtypes much less anonymous ones.

I'm thinking that addressing these pieces would be a generally good
thing to do, above and beyond potential uses in recursive queries.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emmanuel Cecchet 2008-10-08 02:02:48 Re: Transactions and temp tables
Previous Message Robert Haas 2008-10-08 01:11:14 Re: Updates of SE-PostgreSQL 8.4devel patches