Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

From: David Fetter <david(at)fetter(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
Date: 2014-04-25 18:46:07
Message-ID: 20140425184607.GI16465@fetter.org
Lists: pgsql-hackers

On Fri, Apr 25, 2014 at 10:58:29AM -0700, Josh Berkus wrote:
> On 04/24/2014 05:23 PM, Marti Raudsepp wrote:
> > On Thu, Apr 24, 2014 at 8:40 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> >> A pseudo-random UUID is frankly pretty
> >> useless to me because (a) it's not really unique
> >
> > This is FUD. A pseudorandom UUID contains 122 bits of randomness. As
> > long as you can trust the random number generator, the chances of a
> > value occurring twice can be estimated using the birthday paradox:
> > there's a 50% chance of having *one* collision in a set of 2^61 items.
> > Storing this many UUIDs alone requires 32 exabytes of storage.
> > Factor in the tuple and indexing overheads and you'd be needing close
> > to all the hard disk space ever manufactured in the world.
>
> Well, I've already had collisions with UUID-OSSP, in production, with
> only around 20 billion values. So clearly there aren't 122 bits of true
> randomness in OSSP. I can't speak for other implementations because I
> haven't tried them.
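
For scale, the birthday estimate quoted above can be checked numerically. This is a minimal sketch, assuming 122 uniformly random bits per UUID (the version-4 layout) and an ideal random number generator; the function names are made up for illustration. Under the standard approximation, 2^61 values give a collision probability of roughly 39%, and the 50% point sits near 1.18 * 2^61, so the quoted figure is right to within an order of magnitude. With 20 billion values the expected probability is vanishingly small, which is consistent with the observed collisions pointing at the generator rather than at the 122-bit space.

```python
import math

UUID_V4_RANDOM_BITS = 122  # assumption: random bits in a version-4 UUID


def collision_probability(n, bits=UUID_V4_RANDOM_BITS):
    """Approximate P(at least one collision) among n uniform random values."""
    space = 2.0 ** bits
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space)).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n * (n - 1) / (2.0 * space))


def storage_bytes(n, uuid_size=16):
    """Raw storage for n UUIDs at 16 bytes each, ignoring all overhead."""
    return n * uuid_size


p_at_2_61 = collision_probability(2 ** 61)
p_at_20_billion = collision_probability(20_000_000_000)
exbibytes_at_2_61 = storage_bytes(2 ** 61) / 2 ** 60

print(f"2^61 values:       P(collision) ~ {p_at_2_61:.3f}")
print(f"20 billion values: P(collision) ~ {p_at_20_billion:.2e}")
print(f"2^61 UUIDs need {exbibytes_at_2_61:.0f} EiB of raw storage")
```
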
>
> >> (b) it doesn't help me route data at all.
> >
> > That's really out of scope for UUIDs. They're about generating
> > identifiers, not describing what the identifier means. UUIDs also
> > don't happen to cure cancer.
>
> http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
>
> On the contrary, I would argue that an object identifier which is
> completely random is possibly the worst way to form an ID of all
> possible concepts; there's no relationship whatsoever between the ID,
> the application stack, and the application data; you don't even get the
> pseudo-time indexing you get with Serials. The only reason to do it is
> because you're too lazy to implement a better way.
>
> Or to put it another way: a value which is truly random is no identifier
> at all.

Not exactly. It's at least potentially hiding information an attacker
could use, with all the caveats that carries.

> Compare this with a composite identifier which carries information about
> the node, table, and schema of origin for the tuple. Not only does this
> help ensure uniqueness, but it also supports intelligent sharding and
> multi-master replication systems. I don't speak hypothetically; we've
> done this in the past and will do it again in the future.

This is an excellent idea, but I don't think it's in scope for UUIDs.
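
To make the idea concrete, here is a rough sketch of what such a composite identifier could look like. Everything in it is an assumption for illustration: the class name, the field widths (10 bits of node, 8 of schema, 14 of table, 32 of local sequence, 64 bits in total), and the packing order are hypothetical, not something from this thread or from PostgreSQL.

```python
from dataclasses import dataclass

# Hypothetical field widths; together they fill a 64-bit integer.
NODE_BITS, SCHEMA_BITS, TABLE_BITS, SEQ_BITS = 10, 8, 14, 32


@dataclass(frozen=True)
class CompositeId:
    node: int    # originating node
    schema: int  # schema of origin
    table: int   # table of origin
    seq: int     # per-table local sequence value

    def pack(self) -> int:
        """Pack the fields into one integer, node in the highest bits."""
        fields = (
            (self.node, NODE_BITS),
            (self.schema, SCHEMA_BITS),
            (self.table, TABLE_BITS),
            (self.seq, SEQ_BITS),
        )
        packed = 0
        for value, width in fields:
            if not 0 <= value < (1 << width):
                raise ValueError(f"value {value} does not fit in {width} bits")
            packed = (packed << width) | value
        return packed

    @classmethod
    def unpack(cls, packed: int) -> "CompositeId":
        """Recover the fields from a packed identifier."""
        seq = packed & ((1 << SEQ_BITS) - 1)
        packed >>= SEQ_BITS
        table = packed & ((1 << TABLE_BITS) - 1)
        packed >>= TABLE_BITS
        schema = packed & ((1 << SCHEMA_BITS) - 1)
        packed >>= SCHEMA_BITS
        return cls(node=packed, schema=schema, table=table, seq=seq)


cid = CompositeId(node=3, schema=1, table=42, seq=1000)
packed = cid.pack()
# Routing only needs the top bits: shift away everything below the node field.
origin_node = packed >> (SCHEMA_BITS + TABLE_BITS + SEQ_BITS)
```

Uniqueness here comes from each node owning its own sequence, and a router can pick the shard from the top bits without consulting the data. The cost is that the identifier is no longer opaque.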

> I would love to have some machinery inside PostgreSQL to make this
> easier (for example, a useful unique database ID), but I suspect that
> actual implementation will always remain application-specific.
>
> You may say "oh, that's not the job of the identifier", but if it's not,
> WTF is the identifier for, then?

Frequently, it's to provide some kind of opacity, in the sense of not
having an obvious predecessor or successor.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
