Behavior of GENERATED columns per SQL2003

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zoltan Boszormenyi <zb(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Behavior of GENERATED columns per SQL2003
Date: 2007-05-07 21:16:15
Message-ID: 18812.1178572575@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've been studying the SQL spec in a bit more detail and I'm suddenly
thinking that we've got the behavior all wrong in the current
GENERATED/IDENTITY patch. In particular, it looks to me like we've
been implementing GENERATED ALWAYS AS (expr) according to the rules
that the spec in fact lays down only for GENERATED ALWAYS AS IDENTITY.
You'd think the two constructs would be pretty closely related but
the behaviors specified by the spec are light-years apart. If you
look closely, a "generated column" in the meaning of section 4.14.8
is one that has GENERATED ALWAYS AS (expr), and identity columns are
*not* in this class.

It looks to me like the behavior the spec intends for a generated column
is actually that it can be implemented as a "virtual column" occupying
no space on disk and instead computed on-the-fly when retrieved.
Identity columns can have their values overridden by the
user (it's a little harder if GENERATED ALWAYS, but still possible),
and they don't change during an UPDATE unless specifically forced to.
In contrast, generated columns cannot be overridden by
assignment, and are recomputed from their base columns during updates.
This realization also explains the following, otherwise rather strange,
facts:

* There is no GENERATED BY DEFAULT AS (expr) in the spec.

* GENERATED expressions are specifically disallowed from containing
subselects, calling functions that access any SQL-data, or being
nondeterministic; hence their values depend solely on the regular
columns in the same row.

* While identity columns are updated (if needed) before execution of
BEFORE triggers, generated columns are updated after BEFORE triggers;
hence a BEFORE trigger can override the value in one case and not the
other. (The current patch gets this wrong, btw.)

* Generated columns are forcibly updated when their base columns change
as a result of FK constraints (such as ON UPDATE CASCADE).

It looks to me like a BEFORE trigger is actually the only place that can
(transiently) see values of a generated column that are different from
the result of applying the generation expression on the rest of the row.
It's unclear whether that's intentional or an oversight.

Is anyone familiar with a database that implements SQL-spec generated
columns? Do they actually store the columns?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2007-05-07 21:27:05 pg_type
Previous Message Kurt Harriman 2007-05-07 20:53:12 BufFileWrite across MAX_PHYSICAL_FILESIZE boundary