Re: string PK vs. interger PK

Lists: pgsql-admin
From: Jodi Kanter <jkanter(at)virginia(dot)edu>
To: Postgres Admin List <pgsql-admin(at)postgresql(dot)org>
Subject: string PK vs. interger PK
Date: 2002-04-15 18:56:20
Message-ID: 019201c1e4af$3b855c90$de138f80@virginia.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Can anyone offer some insight as to what should be considered when choosing between a primary key that is an integer vs. a primary key that is a string value? Does one or the other affect indexing speed? update speed?
Is a serial integer value better than using a PK that has some "value" in its meaning but is a string?
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter(at)virginia(dot)edu


From: "Nick Fankhauser" <nickf(at)ontko(dot)com>
To: "Jodi Kanter" <jkanter(at)virginia(dot)edu>, "Postgres Admin List" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: string PK vs. interger PK
Date: 2002-04-15 19:20:06
Message-ID: NEBBLAAHGLEEPCGOBHDGKEHNEMAA.nickf@ontko.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin


As a general rule, a primary key (or any relationship key) should not
contain a value that means something beyond its use in relating entities.
The problem is that if a field describes an object, the day may come when
you want to change the description, but you *never* want to change the field
that identifies the record and relates it to other records.

I can't offer insider's knowledge on speed of int vs string, but it seems
very reasonable to think that an int would be faster. On the other hand, I'm
forced to use strings in my app, and the performance doesn't suffer much so
they must be close.

-Nick

--------------------------------------------------------------------------
Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/

-----Original Message-----
From: pgsql-admin-owner(at)postgresql(dot)org
[mailto:pgsql-admin-owner(at)postgresql(dot)org]On Behalf Of Jodi Kanter
Sent: Monday, April 15, 2002 1:56 PM
To: Postgres Admin List
Subject: [ADMIN] string PK vs. interger PK

Can anyone offer some insight as to what should be considered when choosing
between a primary key that is an integer vs. a primary key that is a string
value? Does one or the other affect indexing speed? update speed?
Is a serial integer value better than using a PK that has some "value" in
its meaning but is a string?
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter(at)virginia(dot)edu


From: "Marc Mitchell" <marcm(at)eisolution(dot)com>
To: "Jodi Kanter" <jkanter(at)virginia(dot)edu>, "Postgres Admin List" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: string PK vs. interger PK
Date: 2002-04-15 19:51:38
Message-ID: 01cc01c1e4b6$f538b680$6701050a@eisolution.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Jodi:

This opinion has less to do with Postgres and more to do will a general
philosophy but we always use a 4 byte integer column as the primary key for
almost every table. None of the following reasons means it has to be that
way, but all support Ints over strings:

1) Strings would normally be bigger than 4 byte integers. While I have no
experience to say this size issue could contribute to a perceived
performance issue, theoretically speaking, a smaller, more compact value
would be better.

2) Int processing of equality and inequality comparisons should be much
faster than string compares. However, again I have no experience to say
this size issue could contribute to a perceived performance issue.

3) Ints make it easier to employ Postgres Sequences in generating PK
values. These are very nice and quick.

4) Lastly, I'll just agree with the other replies that I've already seen
that you should avoid putting any kind of meaningful data into PK columns.
That fact that you want to consider strings almost implies meaning to the
values. For example, if these values are dictated by an outside source
which happened to use an alphanumeric value, then they carry meaning to an
outside source even if that meaning is that it is simply their own unique
identifier.

We've been faced with this question many a time. We almost always chose an
int PK and more often than not when we've strayed from this course, it's
come back to haunt us.

Hope this is helpful.

Marc Mitchell - Senior Application Architect
Enterprise Information Solutions, Inc.
4910 Main Street
Downers Grove, IL 60515
(630) 512-0570
marcm(at)eisolution(dot)com

----- Original Message -----
From: "Jodi Kanter" <jkanter(at)virginia(dot)edu>
To: "Postgres Admin List" <pgsql-admin(at)postgresql(dot)org>
Sent: Monday, April 15, 2002 1:56 PM
Subject: [ADMIN] string PK vs. interger PK

Can anyone offer some insight as to what should be considered when choosing
between a primary key that is an integer vs. a primary key that is a string
value? Does one or the other affect indexing speed? update speed?
Is a serial integer value better than using a PK that has some "value" in
its meaning but is a string?
Thanks
Jodi
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter(at)virginia(dot)edu


From: "Dan Langille" <dan(at)langille(dot)org>
To: "Postgres Admin List" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: string PK vs. interger PK
Date: 2002-04-15 20:02:44
Message-ID: 20020415200405.9BCD93F28@bast.unixathome.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

On 15 Apr 2002 at 14:20, Nick Fankhauser wrote:

> As a general rule, a primary key (or any relationship key) should not
> contain a value that means something beyond its use in relating entities.
> The problem is that if a field describes an object, the day may come when
> you want to change the description, but you *never* want to change the
> field that identifies the record and relates it to other records.

FWIW, I would recommend not using business values as a primary key (and by
extension, as a foreign key). Business values are just that: business
values. Sure, put unique constraints on business values if you want.
Just don't make them a primary key or a foreign key.
--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/ - practical examples