Remove implicit unique index creation on SERIAL columns?

Lists: pgsql-hackers
From: Rod Taylor <rbt(at)zort(dot)ca>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-17 20:57:40
Message-ID: 1029617861.37296.59.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'd like to propose dropping the auto-creation of UNIQUE indexes on
serial columns for the following reasons:

1. Serials with indexes are quite difficult to handle in pg_dump. It
means that the implicitly created unique index must be destroyed prior
to loading the data, then re-created afterward else risk a performance
hit.

2. SERIAL columns are usually used as the primary key of the table. As
such one must specify PRIMARY KEY at creation time to override the
implicitly created UNIQUE index.

3. Consistency with other databases.

MySQL's AUTO_INCREMENT suggests quite heavily that an index be applied,
but it doesn't appear to do it for you.

Oracle has SEQUENCES, but it is up to the user to associate them with a
column as per my understanding. At least, thats all I could find.

SAP -> SERIAL and SERIAL(n). No index is created (creates sequence /
default)

A nice side effect is that analyze.c will become somewhat cleaner.

Any thoughts?

Does anyone create serial columns without manually specifying UNIQUE
when wanting an index? Examples in our documentation do. See section
5.1.4 in the current docset:
http://developer.postgresql.org/docs/postgres/datatype.html

Removal of implicit UNIQUE index creation would not affect structure of
current or prior db versions -- strictly new tables on 7.3.


From: Joe Conway <mail(at)joeconway(dot)com>
To: Rod Taylor <rbt(at)zort(dot)ca>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-17 21:26:10
Message-ID: 3D5EBF72.70405@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Rod Taylor wrote:
> I'd like to propose dropping the auto-creation of UNIQUE indexes on
> serial columns for the following reasons:
>
> 1. Serials with indexes are quite difficult to handle in pg_dump. It
> means that the implicitly created unique index must be destroyed prior
> to loading the data, then re-created afterward else risk a performance
> hit.
>
>
> 2. SERIAL columns are usually used as the primary key of the table. As
> such one must specify PRIMARY KEY at creation time to override the
> implicitly created UNIQUE index.
>
> 3. Consistency with other databases.
>
> MySQL's AUTO_INCREMENT suggests quite heavily that an index be applied,
> but it doesn't appear to do it for you.
>
> Oracle has SEQUENCES, but it is up to the user to associate them with a
> column as per my understanding. At least, thats all I could find.
>
> SAP -> SERIAL and SERIAL(n). No index is created (creates sequence /
> default)

FWIW, also MSSQL.

I agree 100%. If you want an index, unique constraint, or primary key on
a SERIAL, I think you should explicitly add it. SERIAL should give me a
column that automatically increments -- no more, no less.

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 01:13:18
Message-ID: 6829.1029633198@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joe Conway <mail(at)joeconway(dot)com> writes:
> I agree 100%. If you want an index, unique constraint, or primary key on
> a SERIAL, I think you should explicitly add it. SERIAL should give me a
> column that automatically increments -- no more, no less.

Hmm, do you also want to eliminate the implicit NOT NULL constraint?

I think that efficiency and orthogonality are adequate reasons for
dissociating UNIQUE from SERIAL. The efficiency argument is pretty
weak in the case of the NOT NULL part, though, so maybe backwards
compatibility should win out there.

Another line of thought: as near as I can tell, SQL92 allows defaults
and CHECK constraints to be associated with domains. Taking the
viewpoint that SERIAL is a domain would say that the DEFAULT nextval()
is okay, and CHECK NOT NULL is okay, but UNIQUE is not okay to be
implied by the domain type. Perhaps the SQL authors knew what they
were doing when they made that restriction...

(Note that although I just dissuaded Rod from actually turning SERIAL
into a domain on compatibility grounds, I don't see any reason why
we shouldn't use the spec's rules about domains to reason about how
it should work. In a slightly longer timeframe we may decide that
we do want to make it a domain.)

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 06:35:32
Message-ID: 3D5F4034.7040405@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>I agree 100%. If you want an index, unique constraint, or primary key on
>>a SERIAL, I think you should explicitly add it. SERIAL should give me a
>>column that automatically increments -- no more, no less.
>
> Hmm, do you also want to eliminate the implicit NOT NULL constraint?
>
> I think that efficiency and orthogonality are adequate reasons for
> dissociating UNIQUE from SERIAL. The efficiency argument is pretty
> weak in the case of the NOT NULL part, though, so maybe backwards
> compatibility should win out there.

To be honest I wasn't thinking about NOT NULL. I'd agree with leaving
that in place.

Maybe I should restate my comment above: SERIAL should give me a column
that automatically increments -- no more, no less -- and it should not
allow me to override the value that it gives. Hence an implicit NOT
NULL, but also an implicit rejection of a manual insert/update of that
field (how hard would this be to do?).

I know this causes problems for dumped and reloaded data. In MSSQL this
is gotten around by allowing the properties of the data type to be
altered, e.g. in MSSQL you can turn the IDENTITY property on or off (in
fact, I just checked and MSSQL won't allow you to turn on IDENTITY for a
column unless you turn off ALLOW NULLS). You can also specify an
exception to the rule when running BCP (the bulk loader command line
program).

The reason I think this behavior is good, is that it helps prevent toe
loss from stray bullets. E.g. you manually add a row where you've
specified some value that has not yet been reached by the sequence --
then when someday the sequence reaches said value, your insert fails on
a duplicate primary key insertion attempt.

If you really need to be able to insert or update a field with an
explicit value *sometimes* (and you really know what you're doing), then
use a plain sequence and a default, not a SERIAL.

Anyway, just my thoughts.

Joe


From: Þórhallur Hálfdánarson <tolli(at)tol(dot)li>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 12:37:51
Message-ID: 20020818123751.A30716@tol.li
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

-*- Joe Conway <mail(at)joeconway(dot)com> [ 2002-08-18 06:36 ]:
> Maybe I should restate my comment above: SERIAL should give me a column
> that automatically increments -- no more, no less -- and it should not
> allow me to override the value that it gives. Hence an implicit NOT
> NULL, but also an implicit rejection of a manual insert/update of that
> field (how hard would this be to do?).
>
> I know this causes problems for dumped and reloaded data. In MSSQL this
> is gotten around by allowing the properties of the data type to be
> altered, e.g. in MSSQL you can turn the IDENTITY property on or off (in
> fact, I just checked and MSSQL won't allow you to turn on IDENTITY for a
> column unless you turn off ALLOW NULLS). You can also specify an
> exception to the rule when running BCP (the bulk loader command line
> program).
>
> The reason I think this behavior is good, is that it helps prevent toe
> loss from stray bullets. E.g. you manually add a row where you've
> specified some value that has not yet been reached by the sequence --
> then when someday the sequence reaches said value, your insert fails on
> a duplicate primary key insertion attempt.
>
> If you really need to be able to insert or update a field with an
> explicit value *sometimes* (and you really know what you're doing), then
> use a plain sequence and a default, not a SERIAL.

I like the functionality you are suggesting. However, when I started thinking about what size this field should be, the idea of keeping the current SERIAL "data type" and having a SERIAL flag (CREATE TABLE foo (id int serial))... so it's pretty much the same as a seqence, but enforced so that you can't tamper with it. The best part about this is obviously that it doesn't change behaviour of the old SERIAL.

Am I right about this, or am I missing something here?

--
Regards,
Tolli
tolli(at)tol(dot)li


From: Rod Taylor <rbt(at)zort(dot)ca>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL
Date: 2002-08-18 13:21:24
Message-ID: 1029676885.37296.153.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2002-08-18 at 02:35, Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <mail(at)joeconway(dot)com> writes:
> >>I agree 100%. If you want an index, unique constraint, or primary key on
> >>a SERIAL, I think you should explicitly add it. SERIAL should give me a
> >>column that automatically increments -- no more, no less.
> >
> > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> >
> > I think that efficiency and orthogonality are adequate reasons for
> > dissociating UNIQUE from SERIAL. The efficiency argument is pretty
> > weak in the case of the NOT NULL part, though, so maybe backwards
> > compatibility should win out there.
>
> To be honest I wasn't thinking about NOT NULL. I'd agree with leaving
> that in place.
>
> Maybe I should restate my comment above: SERIAL should give me a column
> that automatically increments -- no more, no less -- and it should not
> allow me to override the value that it gives. Hence an implicit NOT
> NULL, but also an implicit rejection of a manual insert/update of that
> field (how hard would this be to do?).

I don't like not overriding the value. A dataload example is a perfect
reason why you would want to. Anyway, this would require placing 2
triggers on the table in order to prevent changes of the value.

Personally I prefer the method that SAPdb uses. They make the column a
fixed() type (integer) and set the default. Nothing about NOT NULL or
UNIQUE.

Anyway, I think SERIAL is about assisting creation of a entry, not
enforcing it. Enforcement is trivial for those who don't mind the
additional overhead.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 16:46:47
Message-ID: 24150.1029689207@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joe Conway <mail(at)joeconway(dot)com> writes:
> I know this causes problems for dumped and reloaded data.

Yup.

> In MSSQL this
> is gotten around by allowing the properties of the data type to be
> altered, e.g. in MSSQL you can turn the IDENTITY property on or off

Rather pointless if it can be turned off, wouldn't you say?

What I would do if I wanted such a guarantee is to make insertions
go through a view rule that prevents the serial column from being
supplied directly. That's a general technique that applies to more than
just serial columns, and it is also easy to control who has privileges
to bypass the view. AFAICT this "IDENTITY" thingie is an extremely
limited kluge.

regards, tom lane


From: Rod Taylor <rbt(at)zort(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL
Date: 2002-08-18 16:57:35
Message-ID: 1029689856.37296.182.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > In MSSQL this
> > is gotten around by allowing the properties of the data type to be
> > altered, e.g. in MSSQL you can turn the IDENTITY property on or off
>
> Rather pointless if it can be turned off, wouldn't you say?
>
> What I would do if I wanted such a guarantee is to make insertions
> go through a view rule that prevents the serial column from being

In the future if we have per column privileges, we could disable insert
and update on serial columns by default for everyone but the table owner
and super-users (to accommodate dataloads).

Till then a trigger or view will do the job for those who wish to add
them.


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 17:07:23
Message-ID: 3D5FD44B.2030702@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>In MSSQL this
>>is gotten around by allowing the properties of the data type to be
>>altered, e.g. in MSSQL you can turn the IDENTITY property on or off
>
> Rather pointless if it can be turned off, wouldn't you say?

Not really. Turning it off takes an explict act of the dba (ALTER TABLE)
for a special circumstance such as importing data. In the normal case
while it's turned on, it is the developers who need and get the foot
teflon (INSERT/UPDATE).

> What I would do if I wanted such a guarantee is to make insertions
> go through a view rule that prevents the serial column from being
> supplied directly. That's a general technique that applies to more than
> just serial columns, and it is also easy to control who has privileges
> to bypass the view.

I thought about the view rule myself. Nice, but beyond the grasp of
inexperienced users, who I think SERIAL is target for.

> AFAICT this "IDENTITY" thingie is an extremely limited kluge.

It is extremely limited -- that's why sequences are so much better
(IMHO). But SERIAL is for those who want something simple to use, don't
want to think about the implementation details, and don't mind it being
limited. Otherwise why have it at all -- just make everyone use sequences.

Joe


From: Joe Conway <mail(at)joeconway(dot)com>
To: Rod Taylor <rbt(at)zort(dot)ca>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-18 19:02:43
Message-ID: 3D5FEF53.9010005@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Rod Taylor wrote:
>>>In MSSQL this
>>>is gotten around by allowing the properties of the data type to be
>>>altered, e.g. in MSSQL you can turn the IDENTITY property on or off
>>
>>Rather pointless if it can be turned off, wouldn't you say?
>>
>>What I would do if I wanted such a guarantee is to make insertions
>>go through a view rule that prevents the serial column from being
>
>
> In the future if we have per column privileges, we could disable insert
> and update on serial columns by default for everyone but the table owner
> and super-users (to accommodate dataloads).

That sounds like the best way-to-go.

> Till then a trigger or view will do the job for those who wish to add
> them.

Agreed.

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rod Taylor <rbt(at)zort(dot)ca>
Cc: Joe Conway <mail(at)joeconway(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL
Date: 2002-08-19 16:00:22
Message-ID: 12069.1029772822@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Since there didn't seem to be anyone objecting to the notion of
decoupling UNIQUE from SERIAL, I'm going to go ahead with
reviewing/applying Rod's recent patch that does that (and fixes pg_dump
to dump 7.3 serials correctly). We can continue to debate about
the merits of making additional changes in SERIAL behavior, of course,
but this is the one that pg_dump needs.

regards, tom lane


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-20 01:34:10
Message-ID: 200208200134.g7K1YAo15826@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <mail(at)joeconway(dot)com> writes:
> >>I agree 100%. If you want an index, unique constraint, or primary key on
> >>a SERIAL, I think you should explicitly add it. SERIAL should give me a
> >>column that automatically increments -- no more, no less.
> >
> > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> >
> > I think that efficiency and orthogonality are adequate reasons for
> > dissociating UNIQUE from SERIAL. The efficiency argument is pretty
> > weak in the case of the NOT NULL part, though, so maybe backwards
> > compatibility should win out there.
>
> To be honest I wasn't thinking about NOT NULL. I'd agree with leaving
> that in place.
>
> Maybe I should restate my comment above: SERIAL should give me a column
> that automatically increments -- no more, no less -- and it should not
> allow me to override the value that it gives. Hence an implicit NOT
> NULL, but also an implicit rejection of a manual insert/update of that
> field (how hard would this be to do?).

If don't understand. We already have a unique index on the SERIAL
column, so why bother rejecting an insert/update that supplies the
value? We need the column to be unique, and that is forced, but why
prevent _any_ unique value from being used.

Clearly NULL is not a valid value and should be prevented with NOT NULL.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-20 03:01:58
Message-ID: 20020819195708.U37544-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Mon, 19 Aug 2002, Bruce Momjian wrote:

> Joe Conway wrote:
> > Tom Lane wrote:
> > > Joe Conway <mail(at)joeconway(dot)com> writes:
> > >>I agree 100%. If you want an index, unique constraint, or primary key on
> > >>a SERIAL, I think you should explicitly add it. SERIAL should give me a
> > >>column that automatically increments -- no more, no less.
> > >
> > > Hmm, do you also want to eliminate the implicit NOT NULL constraint?
> > >
> > > I think that efficiency and orthogonality are adequate reasons for
> > > dissociating UNIQUE from SERIAL. The efficiency argument is pretty
> > > weak in the case of the NOT NULL part, though, so maybe backwards
> > > compatibility should win out there.
> >
> > To be honest I wasn't thinking about NOT NULL. I'd agree with leaving
> > that in place.
> >
> > Maybe I should restate my comment above: SERIAL should give me a column
> > that automatically increments -- no more, no less -- and it should not
> > allow me to override the value that it gives. Hence an implicit NOT
> > NULL, but also an implicit rejection of a manual insert/update of that
> > field (how hard would this be to do?).
>
> If don't understand. We already have a unique index on the SERIAL
> column, so why bother rejecting an insert/update that supplies the
> value? We need the column to be unique, and that is forced, but why
> prevent _any_ unique value from being used.

One reason is that the sequence won't respect those inserted values and
you'll get uniqueness errors on statements that don't give a value for the
column where you'd expect to be getting a working autogenerated value.


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Rod Taylor <rbt(at)zort(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remove implicit unique index creation on SERIAL columns?
Date: 2002-08-20 03:17:55
Message-ID: 200208200317.g7K3Htp19255@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephan Szabo wrote:
> > If don't understand. We already have a unique index on the SERIAL
> > column, so why bother rejecting an insert/update that supplies the
> > value? We need the column to be unique, and that is forced, but why
> > prevent _any_ unique value from being used.
>
> One reason is that the sequence won't respect those inserted values and
> you'll get uniqueness errors on statements that don't give a value for the
> column where you'd expect to be getting a working autogenerated value.

Oh, it causes problems later on. Interesting. However, preventing
INSERT/UPDATE seems quite extreme.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073