Re: ANALYZE sampling is too good

From:	Claudio Freire <klaussfreire(at)gmail(dot)com>
To:	Greg Stark <stark(at)mit(dot)edu>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: ANALYZE sampling is too good
Date:	2013-12-10 14:32:14
Message-ID:	CAGTBQpbnhsc7h4fCHBG63kSYt3-DmyeTZ-QjGf30dN-xacrK4Q@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Dec 10, 2013 at 11:02 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
>
> On 10 Dec 2013 08:28, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>>
>>
>> Doesn't all that assume a normally distributed random variable?
>
> I don't think so, because of the law of large numbers. If you have a large
> population and sample it, the sample behaves like a normal distribution even
> if the distribution of the population isn't.

No, the law of large numbers (more precisely, the central limit
theorem) says that if you take the AVERAGE of many samples of a random
variable, the random variable that is the AVERAGE behaves like a normal.

The variable itself doesn't.

And for n_distinct, you need to know the variable itself.
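
To make the distinction concrete, here is a minimal sketch in Python (standard library only). It is an illustration under assumptions, not anything ANALYZE does: the exponential distribution, the sample sizes, and the two toy columns `col_many` and `col_two` are all made up for the example. It shows that averages of a skewed variable look roughly normal while the variable itself stays skewed, and that two columns can share the same mean yet have very different n_distinct.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: about 0 for a symmetric (e.g. normal) distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def draw_skewed(rng):
    """One draw from a heavily skewed variable (exponential, mean 1)."""
    return rng.expovariate(1.0)


def sample_means(draw, sample_size, repeats, rng):
    """Average `sample_size` draws, repeated `repeats` times."""
    return [statistics.fmean(draw(rng) for _ in range(sample_size))
            for _ in range(repeats)]


rng = random.Random(42)  # fixed seed so the run is repeatable

raw = [draw_skewed(rng) for _ in range(100_000)]
means = sample_means(draw_skewed, 100, 2_000, rng)

raw_skew = skewness(raw)     # stays far from 0: the variable is not normal
mean_skew = skewness(means)  # much closer to 0: the averages look normal

# Two hypothetical columns with the same mean but very different n_distinct.
col_many = list(range(10_000))            # 10,000 distinct values
col_two = [0] * 5_000 + [9_999] * 5_000   # only 2 distinct values
same_mean = statistics.fmean(col_many) == statistics.fmean(col_two)
distinct_counts = (len(set(col_many)), len(set(col_two)))

print(f"skew of raw values:   {raw_skew:.2f}")
print(f"skew of sample means: {mean_skew:.2f}")
print(f"same mean: {same_mean}, n_distinct: {distinct_counts}")
```

The averages converge to a normal shape regardless of the underlying distribution, but that tells you nothing about how many distinct values the column holds.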
