Re: UUID as primary key

Lists: pgsql-general
From: Vance Maverick <vmaverick(at)pgp(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: UUID as primary key
Date: 2006-08-22 04:16:14
Message-ID: 1156220174.10168.17.camel@vmaverick-linux2.pgp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

I'm considering using a UUID as a primary / foreign key for my schema,
to help ensure portability of data in a multi-master context. Does
anyone have experience with this?

There's a project on Gborg (pguuid) to create a "native" UUID type, but
it looks stagnant (and I'd prefer to use PostgreSQL out of the box, if I
can). So I'm considering three possible representations:

* VARCHAR(36) or CHAR(36) containing the standard textual encoding
* NUMERIC(40,0) containing the 128-bit binary version of the UUID,
considered as an integer
* A pair of BIGINT columns, containing the two 64-bit halves of the
binary UUID, handled together as a two-column key.

Would any of these give reasonable performance (for joins of large
tables), compared to int4 IDs? Is any of them clearly any better or
worse than the others?

I'd appreciate any advice!

Vance

* I would certainly consider using a standard int4 sequence ID to join
the tables, with a separate mapping from ID to UUID. However, this
would require some effort when transferring data between databases, to
replace the IDs with UUIDs for transmission. The performance would have
to be quite a bit better to make the extra effort worthwhile.


From: Vance Maverick <vmaverick(at)pgp(dot)com>
To: Harald Armin Massa <haraldarminmassa(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UUID as primary key
Date: 2006-08-22 14:49:26
Message-ID: 1156258166.22672.6.camel@vmaverick-linux2.pgp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Thanks for the good suggestion. I spent some time last night verifying
that it works, and that I can get the values in and out correctly
through JDBC. (When going from Java to the database, for example, it's
easiest to compose the value directly into the SQL query rather than
using a parameter.) This seems quite viable.

Vance

On Tue, 2006-08-22 at 08:44 +0200, Harald Armin Massa wrote:
> Vance,
>
> I use UUID as PK/FK within a rather complex application. To store it I
> use the type INET, which fits quite well.
>
> Compared to sequences I know of some disadvantages:
> - theoretical: keys are less locally distributed
> - bigger keys = more memory usage
> - manually editing of tables is difficult, just because the keys are
> more difficult to compare visually
>
> Harald
>
> --
> GHUM Harald Massa
> persuadere et programmare
> Harald Armin Massa
> Reinsburgstraße 202b
> 70197 Stuttgart
> 0173/9409607
> -
> Let's set so double the killer delete select all.


From: "Harald Armin Massa" <haraldarminmassa(at)gmail(dot)com>
To: "Vance Maverick" <vmaverick(at)pgp(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UUID as primary key
Date: 2006-08-22 15:14:43
Message-ID: 7be3f35d0608220814p25e08fedt479070e2f766ef90@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Vance,

nice that it was of help.

(When going from Java to the database, for example, it's
> easiest to compose the value directly into the SQL query rather than
> using a parameter.) This seems quite viable.

Please allow me to recommend to you to NOT go this seemingly easy way. I
went there myself (but did it in Python and PHP) some years ago, when I was
young and needed the money.

Those "put Data into your SQL Statements with String Substitution" is easy
to get working in the first step; but gets very very hard to keep correct in
the long run. Please google up topics like "SQL injection" and "escaping
characters for SQL" - it is a source of endless frustration.

Another tip to deal with GUIDs in applications:
Because GUIDs are rather long when expressed in base16 (hexadecimal notion),
I made a transformer to Base62, using numbers and all regular letters. That
shortens the typical GUID from 32chars to 21chars.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.


From: Vance Maverick <vmaverick(at)pgp(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: UUID as primary key
Date: 2006-08-22 17:03:51
Message-ID: 1156266231.24659.24.camel@vmaverick-linux2.pgp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Thanks, I've gotten this working nicely now (after some offline
exchanges with Harald).

In JDBC, inet values can be read a couple of ways -- the easiest is to
call ResultSet.getString(). And to write them, the easiest is to use
prepared statements of the form

INSERT INTO xxx VALUES (?::inet, ...)
SELECT * FROM xxx WHERE uuid = ?::inet

filling the parameters in with PreparedStatement.setString().

This is clearly a viable solution -- I'd be interested to hear from
others as well.

Vance

On Tue, 2006-08-22 at 17:14 +0200, Harald Armin Massa wrote:
> Vance,
>
> nice that it was of help.
>
> (When going from Java to the database, for example, it's
> easiest to compose the value directly into the SQL query
> rather than
> using a parameter.) This seems quite viable.
>
> Please allow me to recommend to you to NOT go this seemingly easy way.
> I went there myself (but did it in Python and PHP) some years ago,
> when I was young and needed the money.
>
> Those "put Data into your SQL Statements with String Substitution" is
> easy to get working in the first step; but gets very very hard to keep
> correct in the long run. Please google up topics like "SQL injection"
> and "escaping characters for SQL" - it is a source of endless
> frustration.
>
>
> Another tip to deal with GUIDs in applications:
> Because GUIDs are rather long when expressed in base16 (hexadecimal
> notion), I made a transformer to Base62, using numbers and all regular
> letters. That shortens the typical GUID from 32chars to 21chars.
>
> Harald
>
>
> --
> GHUM Harald Massa
> persuadere et programmare
> Harald Armin Massa
> Reinsburgstraße 202b
> 70197 Stuttgart
> 0173/9409607
> -
> Let's set so double the killer delete select all.


From: Roman Neuhauser <neuhauser(at)sigpipe(dot)cz>
To: Vance Maverick <vmaverick(at)pgp(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UUID as primary key
Date: 2006-08-23 13:05:35
Message-ID: 20060823130535.GB99982@dagan.sigpipe.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

# vmaverick(at)pgp(dot)com / 2006-08-21 21:16:14 -0700:
> I'm considering using a UUID as a primary / foreign key for my schema,
> to help ensure portability of data in a multi-master context. Does
> anyone have experience with this?
>
> There's a project on Gborg (pguuid) to create a "native" UUID type, but
> it looks stagnant (and I'd prefer to use PostgreSQL out of the box, if I
> can). So I'm considering three possible representations:
>
> * VARCHAR(36) or CHAR(36) containing the standard textual encoding
> * NUMERIC(40,0) containing the 128-bit binary version of the UUID,
> considered as an integer
> * A pair of BIGINT columns, containing the two 64-bit halves of the
> binary UUID, handled together as a two-column key.
>
> Would any of these give reasonable performance (for joins of large
> tables), compared to int4 IDs? Is any of them clearly any better or
> worse than the others?

Ralf Engelschall's OSSP uuid looks very good. Written in C with
interfaces into PostgreSQL, PHP and C++ (classes wrapping the C
structures and functions).

http://www.ossp.org/pkg/lib/uuid/

You should be able to e. g.

CREATE TABLE t (id UUID DEFAULT 'uuid(4)' PRIMARY KEY);

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991


From: "Vance Maverick" <vmaverick(at)pgp(dot)com>
To:
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: UUID as primary key
Date: 2006-08-31 14:35:30
Message-ID: DAA9CBC6D4A7584ABA0B6BEA7EC6FC0B0C494E@hq-exch01.corp.pgp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Can you give a link to the PostgreSQL binding? I haven't been able to
find it there. All I see is C, Perl and PHP bindings for the generation
of UUIDs, and nothing about storage.

(For my application, as it happens, I don't need to generate UUIDs in
the database, but I recognize that would be useful.)

Vance

>
> Ralf Engelschall's OSSP uuid looks very good. Written in C with
> interfaces into PostgreSQL, PHP and C++ (classes wrapping the C
> structures and functions).
>
> http://www.ossp.org/pkg/lib/uuid/
>
> You should be able to e. g.
>
> CREATE TABLE t (id UUID DEFAULT 'uuid(4)' PRIMARY KEY);
>
> --
> How many Vietnam vets does it take to screw in a light bulb?
> You don't know, man. You don't KNOW.
> Cause you weren't THERE. http://bash.org/?255991
>


From: Roman Neuhauser <neuhauser(at)sigpipe(dot)cz>
To: Vance Maverick <vmaverick(at)pgp(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UUID as primary key
Date: 2006-08-31 18:22:46
Message-ID: 20060831182246.GA86613@dagan.sigpipe.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

# vmaverick(at)pgp(dot)com / 2006-08-31 07:35:30 -0700:
> > Ralf Engelschall's OSSP uuid looks very good. Written in C with
> > interfaces into PostgreSQL, PHP and C++ (classes wrapping the C
> > structures and functions).
> >
> > http://www.ossp.org/pkg/lib/uuid/
>
> Can you give a link to the PostgreSQL binding? I haven't been able to
> find it there. All I see is C, Perl and PHP bindings for the generation
> of UUIDs, and nothing about storage.

The above page links to the source repository [1] which in turn
contains a "pgsql" directory[2].

[1] http://cvs.ossp.org/dir?d=ossp-pkg/uuid
[2] http://cvs.ossp.org/dir?d=ossp-pkg/uuid/pgsql

> > CREATE TABLE t (id UUID DEFAULT 'uuid(4)' PRIMARY KEY);

> (For my application, as it happens, I don't need to generate UUIDs in
> the database, but I recognize that would be useful.)

Actually, serverside generation of uuids is the least interesting
aspect of that snippet. Use of custom types in indexes requires
appropriate operator classes, and I was trying to give a hint that
the library's support for PostgreSQL is quite comprehensive.

Now, an operator class is no rocket science, but the ability to use
the type in any way without having to add missing features is nice
anyway.

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man. You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991