
Re: benchmarking the query planner

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Gregory Stark <stark(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: benchmarking the query planner
Date:	2008-12-11 23:52:02
Message-ID:	13911.1229039522@sss.pgh.pa.us
Lists:	pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:
>>> And I would like it even more if the sample size increased according
>>> to table size, since that makes ndistinct values fairly random for
>>> large tables.
>>
>> Unfortunately _any_ ndistinct estimate based on a sample of the table
>> is going to be pretty random.

> We know that constructed data distributions can destroy the
> effectiveness of the ndistinct estimate and make sample size irrelevant.
> But typical real world data distributions do improve their estimations
> with increased sample size and so it is worthwhile.

This is handwaving unsupported by evidence. If you've got a specific
proposal for what to change the sample size to, and some numbers about
what it might gain us or cost us, I'm all ears.

regards, tom lane
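
One way to gather the kind of numbers asked for above is a small
simulation. The sketch below is not from the thread. It assumes a
synthetic, skewed integer column (the `skew` parameter and all helper
names are hypothetical) and applies the Haas-Stokes Duj1 estimator,
n*d / (n - f1 + f1*n/N), where n is the sample size, N the table row
count, d the number of distinct values in the sample, and f1 the number
of values seen exactly once. ANALYZE uses an estimator of this form, but
the real code has extra clamping and special cases that are omitted here.

```python
import random


def duj1_estimate(sample, total_rows):
    """Haas-Stokes Duj1 estimate of ndistinct: n*d / (n - f1 + f1*n/N)."""
    n = len(sample)
    counts = {}
    for value in sample:
        counts[value] = counts.get(value, 0) + 1
    d = len(counts)                                   # distinct values in sample
    f1 = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return n * d / (n - f1 + f1 * n / total_rows)


def run_experiment(total_rows, skew, sample_sizes, trials, seed=0):
    """Return (actual ndistinct, {sample_size: [one estimate per trial]}).

    The column is synthetic (an assumption, not real-world data): raising
    a uniform draw to the power `skew` concentrates rows on small values.
    """
    rng = random.Random(seed)
    table = [int(total_rows * rng.random() ** skew) for _ in range(total_rows)]
    actual = len(set(table))
    estimates = {}
    for n in sample_sizes:
        estimates[n] = [
            duj1_estimate(rng.sample(table, n), total_rows)
            for _ in range(trials)
        ]
    return actual, estimates


if __name__ == "__main__":
    # Sample sizes here are arbitrary illustrations, not a proposal.
    actual, estimates = run_experiment(
        total_rows=200_000, skew=3, sample_sizes=[3_000, 30_000, 100_000],
        trials=5, seed=42,
    )
    print(f"actual ndistinct: {actual}")
    for n, ests in estimates.items():
        print(f"sample={n:>7}  min={min(ests):>10.0f}  max={max(ests):>10.0f}")
```

Comparing the min/max spread and the distance from the actual value
across sample sizes gives a rough picture of the gain. The cost side
(ANALYZE time and I/O) would have to be measured separately on a real
table, and results for one synthetic distribution say nothing about
other distributions.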

In response to

Re: benchmarking the query planner at 2008-12-11 23:44:04 from Simon Riggs

Responses

Re: benchmarking the query planner at 2008-12-12 09:35:49 from Simon Riggs
