Re: plpgsql.consistent_into

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plpgsql.consistent_into
Date: 2014-01-14 17:15:08
Message-ID: 19226.1389719708@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> On 1/14/14 12:28 PM, Marti Raudsepp wrote:
>> Now, another question is whether it's possible to make the syntax
>> work. Is this an assignment from the result of a subquery, or is it a
>> query by itself?
>> a = (SELECT foo FROM table);

> That looks like a scalar subquery, which is wrong because they can't
> return more than one column (nor can they be INSERT etc., obviously).

Yeah, it's a scalar subquery, which means that plpgsql already assigns
a non-error meaning to this syntax.

> How about:

> (a) = SELECT 1;
> (a, b) = SELECT 1, 2;
> (a, b) = INSERT INTO foo RETURNING col1, col2;

> Same semantics: TOO_MANY_ROWS on rows > 1, sets FOUND and row_count.
> AFAICT this can be parsed unambiguously, too, and we don't need to look
> at the query string because this is new syntax.

The idea of inventing new syntax along this line seems like a positive
direction to pursue. Since assignment already rejects multiple rows
from the source expression, this wouldn't be weirdly inconsistent.

It might be worth thinking about the <multiple column assignment> UPDATE
syntax that's in recent versions of the SQL standard:

UPDATE targettab SET (a, b, c) = row-valued-expression [ , ... ] [ WHERE ... ]

We don't actually implement this in PG yet, except for trivial cases, but
it will certainly happen eventually. I think your sketch above deviates
unnecessarily from what the standard says for UPDATE. In particular
I think it'd be better to write things like

(a, b) = ROW(1, 2);
(a, b, c) = (SELECT x, y, z FROM foo WHERE id = 42);

which would exactly match what you'd write in a multiple-assignment
UPDATE, and it has the same rejects-multiple-rows semantics too.

Also note that the trivial cases we do already implement in UPDATE
look like

UPDATE targettab SET (a, b, c) = (1, 2, 3) [ WHERE ... ]

that is, we allow a row constructor where the optional keyword ROW has
been omitted. I think people would expect to be able to write this in
plpgsql:

(a, b) = (1, 2);

Now, this doesn't provide any guidance for INSERT/UPDATE/DELETE RETURNING,
but frankly I don't feel any need to invent new syntax for those, since
RETURNING INTO already works the way you want.

I'm not too sure what it'd take to make this work. Right now,

SELECT (SELECT x, y FROM foo WHERE id = 42);

would generate "ERROR: subquery must return only one column", but
I think it's mostly a historical artifact that it does that rather than
returning a composite value (of an anonymous record type). If we were
willing to make that change then it seems like it'd be pretty
straightforward to teach plpgsql to handle

(a, b, ...) = row-valued-expression

where there wouldn't actually be any need to parse the RHS any differently
from the way plpgsql parses an assignment RHS right now. Which would be
a good thing IMO. If we don't generalize the behavior of scalar
subqueries then plpgsql would have to jump through a lot of hoops to
support the subselect case.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2014-01-14 17:15:20 Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Previous Message Robert Haas 2014-01-14 17:14:07 Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance