Re: IDENTITY/GENERATED columns

Lists: pgsql-patches
From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: pgsql-patches(at)postgresql(dot)org
Subject: IDENTITY/GENERATED columns
Date: 2006-08-07 17:57:35
Message-ID: 44D77F0F.3070505@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi,

my last patch didn't make it to the -hackers list,
here's a newer one. Let me list what this patch does now:

- CREATE TABLE/ALTER TABLE ADD syntax support for
GENERATED { ALWAYS | BY DEFAULT } AS
{ ( expr ) | IDENTITY ( seq_opts ) }
- catalog indicators of the above properties
- INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE
- INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE
on tables with GENERATED ALWAYS columns
- UPDATE fails when using other than DEFAULT literal for
GENERATED ALWAYS columns
- GRANT {INSERT|UPDATE|ALL} ON table automatically
gives UPDATE permission for the supporting sequence
(missing: ALTER TABLE ADD should give permission
for the new sequence)
- ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N }
syntax support to alter the supporting sequence
- ALTER TABLE tab RENAME also renames the supporting sequence
on both table and column renaming
- ALTER TABLE SET/DROP default is disallowed
- pg_dump support for exporting the above properties including
sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE
when the table has GENERATED ALWAYS columns
- test cases for some operations
- documented

With this version, I mostly covered these TODO entries:

* Add support for SQL-standard GENERATED/IDENTITY columns (almost
done :-) )
* %Disallow changing DEFAULT expression of a SERIAL column? (done)
* %Disallow ALTER SEQUENCE changes for SERIAL sequences because
pg_dump does not dump the changes
(pg_dump dumps the sequence options)
* Add DEFAULT .. AS OWNER so permission checks are done as the table
owner
This would be useful for SERIAL nextval() calls and CHECK
constraints.
(GRANT TABLE grants UPDATE on the supporting sequence,
ALTER TABLE ADD COLUMN will be implemented)
* %Have ALTER TABLE RENAME rename SERIAL sequence names (done)
* Allow SERIAL sequences to inherit permissions from the base table?
(not needed because INSERT is a fastpath, permissions added
with GRANT TABLE, and [will be] extended on ALTER TABLE ADD COLUMN)

I am considering using ALTER TABLE tab ALTER col TYPE
to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY
properties to existing columns. As it stands now, ALTER TYPE
doesn't change these column attributes.

Please, review.

Best regards,
Zoltán Böszörményi

Attachment Content-Type Size
psql-serial-21.diff.gz application/x-tar 25.1 KB

From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: pgsql-patches(at)postgresql(dot)org
Subject: Re: IDENTITY/GENERATED columns
Date: 2006-08-09 15:42:01
Message-ID: 44DA0249.2010502@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi,

here's the next, changes are at the end.

Zoltan Boszormenyi írta:
> Hi,
>
> my last patch didn't make it to the -hackers list,
> here's a newer one. Let me list what this patch does now:
>
> - CREATE TABLE/ALTER TABLE ADD syntax support for
> GENERATED { ALWAYS | BY DEFAULT } AS
> { ( expr ) | IDENTITY ( seq_opts ) }
> - catalog indicators of the above properties
> - INSERT|COPY syntax support for OVERRIDING { SYSTEM | USER } VALUE
> - INSERT|COPY fails for non-owner when using OVERRIDING SYSTEM VALUE
> on tables with GENERATED ALWAYS columns
> - UPDATE fails when using other than DEFAULT literal for
> GENERATED ALWAYS columns
> - GRANT {INSERT|UPDATE|ALL} ON table automatically
> gives UPDATE permission for the supporting sequence
> (missing: ALTER TABLE ADD should give permission
> for the new sequence)
> - ALTER TABLE tab ALTER col { SET seq_opts | RESTART [WITH] N }
> syntax support to alter the supporting sequence
> - ALTER TABLE tab RENAME also renames the supporting sequence
> on both table and column renaming
> - ALTER TABLE SET/DROP default is disallowed
> - pg_dump support for exporting the above properties including
> sequence parameters. Data dump uses OVERRIDING SYSTEM VALUE
> when the table has GENERATED ALWAYS columns
> - test cases for some operations
> - documented
>
> With this version, I mostly covered these TODO entries:
>
>
> * Add support for SQL-standard GENERATED/IDENTITY columns (almost
> done :-) )
> * %Disallow changing DEFAULT expression of a SERIAL column? (done)
> * %Disallow ALTER SEQUENCE changes for SERIAL sequences because
> pg_dump does not dump the changes
> (pg_dump dumps the sequence options)
> * Add DEFAULT .. AS OWNER so permission checks are done as the table
> owner
> This would be useful for SERIAL nextval() calls and CHECK
> constraints.
> (GRANT TABLE grants UPDATE on the supporting sequence,
> ALTER TABLE ADD COLUMN will be implemented)
> * %Have ALTER TABLE RENAME rename SERIAL sequence names (done)
> * Allow SERIAL sequences to inherit permissions from the base table?
> (not needed because INSERT is a fastpath, permissions added
> with GRANT TABLE, and [will be] extended on ALTER TABLE ADD
> COLUMN)
>
> I am considering using ALTER TABLE tab ALTER col TYPE
> to add (and drop) GENERATED ALWAYS and GENERATED AS IDENTITY
> properties to existing columns. As it stands now, ALTER TYPE
> doesn't change these column attributes.
>
> Please, review.
>
> Best regards,
> Zoltán Böszörményi

Changes from previous version:

- ALTER TABLE ADD col type GENERATED AS IDENTITY
adds permission over the newly created sequence
to those who already have INSERT or UPDATE on the table.
- ALTER TABLE SET/DROP DEFAULT also disallowed on
GENERATED ALWAYS columns
- some codepaths were consolidated to avoid source bloat

A remaining larger problem is that
GENERATED ALWAYS AS (expr) should allow column
references from the same table, etc. just like CHECK.
For the sake of generality, DEFAULT must have
the same features, too. E.g. evaluating DEFAULT
value should be done as a second phase, after
evaluating other columns with given constants or
from DEFAULT that doesn't depend on other columns.
Circular dependencies must be avoided, etc.

I talked about implementin setting and dropping
GENERATED / IDENTITY properties. I cooked up this syntax:

ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS ( expr )
Behaves the same as "SET DEFAULT expr" but sets attforceddef.
ALTER TABLE tab ALTER col SET GENERATED ALWAYS AS IDENTITY [( seq_opts )]
It should create supporting sequence iff the column
isn't already an IDENTITY column.
ALTER TABLE tab ALTER col DROP IDENTITY
It should be spelled loud instead of DROP DEFAULT.

That leaves ALTER TABLE ALTER TYPE alone.

Then a short check showed that changing such attributes
on a column isn't defined by SQL2003. The nice thing about
not implementing the above is that it avoids the debate
whether SET GENERATED ALWAYS AS [IDENTITY]
should also rewrite the records.

My answer in the debate would be that
GENERATED ALWAYS AS (expr)
should naturally do that but it isn't clear at all for
GENERATED ALWAYS AS IDENTITY

Closing words: I think it is now ready for acceptance,
it does all what I wanted it to do and it conforms to
SQL2003. Maybe I introduced some memory leaks
but I tried to avoid it by carefully inspecting what other
functions do. And some documentation details can
also be improved. But I think the code is ready.

Please, review.

Thanks and best regards,
Zoltán Böszörményi

Attachment Content-Type Size
psql-serial-24.diff.gz application/x-tar 27.2 KB

From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: pgsql-patches(at)postgresql(dot)org
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <pg(at)rbt(dot)ca>
Subject: Re: IDENTITY/GENERATED columns
Date: 2006-08-14 21:34:25
Message-ID: 44E0EC61.9080404@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Hi,

here's the next version. Changes:
- Extended documentation
- Extending permissions to new sequences
ALTER TABLE tab ADD col type GENERATED AS IDENTITY
didn't work as advertised, now it seems to.
- Test case was also extended.
- Previously introduced memory leaks were plugged. Really.

Now the only feature missing is the previously discussed
GENERATED ALWAYS AS ( expr ) so it can be used like this:

CREATE TABLE tab (
c1 double,
c2 integer,
c3 double GENERATED ALWAYS AS ( col1 + col2),
c4 SMALLINT GENERATED ALWAYS AS
(CASE WHEN c1 > c2 THEN 1 ELSE NULL END)
);

What should the following code produce as a result?

INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0);

This should insert (1.1, 2, 3.1, NULL)

UPDATE tab SET c2 = 1;

Only c2 changes, so: (1.1, 1, 3.1, NULL)
Or should it change to (1.1, 1, 2.1, 1),
e.g. recompute all columns that depend on
changed columns?

UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5;

Now what? It should be (3.5, 2, 5.5, 1)
But based on current UPDATE behaviour,
e.g. values gets computed based on previous
values, it becomes (3.5, 2, 2.1, 1)

That would really need changing the behaviour of UPDATE.
Currently, if I do an

UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2;

then c3 gets its value based on the previous content
of the record. For the above GENERATED ALWAYS
AS (expr) construct to work, UPDATE have to compute
the column values in multipass, something like this:

constant values are computed;
while (is there any non-computed columns)
{
newly_computed = 0;
foreach (column, non-computed-columns)
{
if (column value depends only on computed columns)
{
compute it;
newly_computed++;
}
}
if (newly_computed == 0)
elog(ERROR, "circular dependency");
}

This behaviour change would enable something like this:
CREATE tab2 (c1 integer, c2 integer, c3 integer);
INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2);

Does this described behaviour have any precedent or
standard compliance?

Best regards,
Zoltán Böszörményi

Attachment Content-Type Size
psql-serial-26.diff.gz application/x-tar 28.4 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <pg(at)rbt(dot)ca>
Subject: Re: IDENTITY/GENERATED columns
Date: 2006-08-26 03:44:52
Message-ID: 200608260344.k7Q3iqT06695@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


This is being done for 8.3, right?

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:
> Hi,
>
> here's the next version. Changes:
> - Extended documentation
> - Extending permissions to new sequences
> ALTER TABLE tab ADD col type GENERATED AS IDENTITY
> didn't work as advertised, now it seems to.
> - Test case was also extended.
> - Previously introduced memory leaks were plugged. Really.
>
> Now the only feature missing is the previously discussed
> GENERATED ALWAYS AS ( expr ) so it can be used like this:
>
> CREATE TABLE tab (
> c1 double,
> c2 integer,
> c3 double GENERATED ALWAYS AS ( col1 + col2),
> c4 SMALLINT GENERATED ALWAYS AS
> (CASE WHEN c1 > c2 THEN 1 ELSE NULL END)
> );
>
> What should the following code produce as a result?
>
> INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0);
>
> This should insert (1.1, 2, 3.1, NULL)
>
> UPDATE tab SET c2 = 1;
>
> Only c2 changes, so: (1.1, 1, 3.1, NULL)
> Or should it change to (1.1, 1, 2.1, 1),
> e.g. recompute all columns that depend on
> changed columns?
>
> UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5;
>
> Now what? It should be (3.5, 2, 5.5, 1)
> But based on current UPDATE behaviour,
> e.g. values gets computed based on previous
> values, it becomes (3.5, 2, 2.1, 1)
>
> That would really need changing the behaviour of UPDATE.
> Currently, if I do an
>
> UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2;
>
> then c3 gets its value based on the previous content
> of the record. For the above GENERATED ALWAYS
> AS (expr) construct to work, UPDATE have to compute
> the column values in multipass, something like this:
>
> constant values are computed;
> while (is there any non-computed columns)
> {
> newly_computed = 0;
> foreach (column, non-computed-columns)
> {
> if (column value depends only on computed columns)
> {
> compute it;
> newly_computed++;
> }
> }
> if (newly_computed == 0)
> elog(ERROR, "circular dependency");
> }
>
> This behaviour change would enable something like this:
> CREATE tab2 (c1 integer, c2 integer, c3 integer);
> INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2);
>
> Does this described behaviour have any precedent or
> standard compliance?
>
> Best regards,
> Zolt?n B?sz?rm?nyi
>

[ application/x-tar is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <pg(at)rbt(dot)ca>
Subject: Re: IDENTITY/GENERATED columns
Date: 2006-08-26 04:42:24
Message-ID: 44EFD130.1070106@dunaweb.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Yes, I am not ready with it.

Bruce Momjian írta:
> This is being done for 8.3, right?
>
> ---------------------------------------------------------------------------
>
> Zoltan Boszormenyi wrote:
>
>> Hi,
>>
>> here's the next version. Changes:
>> - Extended documentation
>> - Extending permissions to new sequences
>> ALTER TABLE tab ADD col type GENERATED AS IDENTITY
>> didn't work as advertised, now it seems to.
>> - Test case was also extended.
>> - Previously introduced memory leaks were plugged. Really.
>>
>> Now the only feature missing is the previously discussed
>> GENERATED ALWAYS AS ( expr ) so it can be used like this:
>>
>> CREATE TABLE tab (
>> c1 double,
>> c2 integer,
>> c3 double GENERATED ALWAYS AS ( col1 + col2),
>> c4 SMALLINT GENERATED ALWAYS AS
>> (CASE WHEN c1 > c2 THEN 1 ELSE NULL END)
>> );
>>
>> What should the following code produce as a result?
>>
>> INSERT INTO tab (c1, c2, c3, c4) VALUES (1.1, 2, 0, 0);
>>
>> This should insert (1.1, 2, 3.1, NULL)
>>
>> UPDATE tab SET c2 = 1;
>>
>> Only c2 changes, so: (1.1, 1, 3.1, NULL)
>> Or should it change to (1.1, 1, 2.1, 1),
>> e.g. recompute all columns that depend on
>> changed columns?
>>
>> UPDATE tab SET c4 = DEFAULT, c3 = DEFAULT, c2 = 2, c1 = 3.5;
>>
>> Now what? It should be (3.5, 2, 5.5, 1)
>> But based on current UPDATE behaviour,
>> e.g. values gets computed based on previous
>> values, it becomes (3.5, 2, 2.1, 1)
>>
>> That would really need changing the behaviour of UPDATE.
>> Currently, if I do an
>>
>> UPDATE tab SET c1 = 3.5, c2 = 2, c3 = c1 + c2;
>>
>> then c3 gets its value based on the previous content
>> of the record. For the above GENERATED ALWAYS
>> AS (expr) construct to work, UPDATE have to compute
>> the column values in multipass, something like this:
>>
>> constant values are computed;
>> while (is there any non-computed columns)
>> {
>> newly_computed = 0;
>> foreach (column, non-computed-columns)
>> {
>> if (column value depends only on computed columns)
>> {
>> compute it;
>> newly_computed++;
>> }
>> }
>> if (newly_computed == 0)
>> elog(ERROR, "circular dependency");
>> }
>>
>> This behaviour change would enable something like this:
>> CREATE tab2 (c1 integer, c2 integer, c3 integer);
>> INSERT INTO tab2 (c1,c2,c3) VALUES (1, 2, c1 + c2);
>>
>> Does this described behaviour have any precedent or
>> standard compliance?
>>
>> Best regards,
>> Zolt?n B?sz?rm?nyi
>>
>>
>
> [ application/x-tar is not supported, skipping... ]
>
>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>>
>
>