Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Date: 2009-04-06 02:40:21
Message-ID: 603c8f070904051940p406cc4d0g46c086621f5dff67@mail.gmail.com
Lists: pgsql-hackers

On Sun, Apr 5, 2009 at 10:38 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Apr 5, 2009 at 10:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Sun, Apr 5, 2009 at 7:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> [ shrug... ]  Precision is not important for this value: we are not
>>>> anywhere near needing more than six significant digits for our
>>>> statistical estimates.  Range, on the other hand, could be important
>>>> when dealing with really large tables.
>>
>>> I thought about that, and if you think that's better, I can implement
>>> it that way.  Personally, I'm unconvinced.  The use case for
>>> specifying a number of distinct values in excess of 2 billion as an
>>> absolute number rather than as a percentage of the table size seems
>>> pretty weak to me.
>>
>> I was more concerned about the other end of it.  Your patch sets a
>> not-too-generous lower bound on the percentage that can be represented ...
>
> Huh?  With a scaling factor of 1 million, you can represent anything
> down to about 0.000001, which is apparently all you can expect out of
> a float4 anyway.
>
> http://archives.postgresql.org/pgsql-bugs/2009-01/msg00039.php

I guess I'm wrong here - 0.000001 is only one SIGNIFICANT digit. But
the point remains that specifying ndistinct in ppm is probably enough
for most cases, and ppb (which would still fit in int4) even more so.
I don't think we need to worry about people with trillions of rows
(and even they could still specify an absolute number).
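
To make the arithmetic concrete, here is a minimal sketch of storing a fraction of the table's rows as a scaled integer, at parts per million and parts per billion. The names (encode_fraction, distinct_rows, PPM, PPB) are hypothetical and are not taken from the patch; the sketch only illustrates the range and resolution argument, assuming a signed 32-bit storage field.

```python
# Illustrative sketch only: hypothetical names, not the patch's code.
INT4_MAX = 2**31 - 1        # largest signed 32-bit integer (int4)

PPM = 1_000_000             # scale factor: parts per million
PPB = 1_000_000_000         # scale factor: parts per billion


def encode_fraction(fraction, scale):
    """Store a fraction of the table's rows (0.0 to 1.0) as a scaled integer."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    scaled = round(fraction * scale)
    if scaled > INT4_MAX:
        raise OverflowError("scaled value does not fit in int4")
    return scaled


def distinct_rows(scaled, scale, total_rows):
    """Turn a stored scaled fraction back into an absolute distinct count."""
    return scaled * total_rows // scale


# A fraction of 1.0 at ppb scale is 1,000,000,000, which is below INT4_MAX.
full_table_ppb = encode_fraction(1.0, PPB)

# Below half a ppm the stored value rounds to zero; ppb still resolves it.
tiny_ppm = encode_fraction(0.0000004, PPM)
tiny_ppb = encode_fraction(0.0000004, PPB)

# On a trillion-row table, the smallest ppb step corresponds to 1000 rows.
rows_per_ppb_step = distinct_rows(1, PPB, 10**12)
```

The point of the sketch is that the whole 0-to-1 range fits in int4 at either scale, and that the smallest nonzero step at ppb is coarse only for tables far larger than a billion rows, where an absolute count could be given instead.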

...Robert
