Re: benchmarking the query planner

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: benchmarking the query planner
Date: 2008-12-11 23:44:04
Message-ID: 1229039044.13078.193.camel@hp_dx2400_1
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Thu, 2008-12-11 at 22:29 +0000, Gregory Stark wrote:

> > And I would like it even more if the sample size increased according
> to table size, since that makes ndistinct values fairly random for
> large
> > tables.
>
> Unfortunately _any_ ndistinct estimate based on a sample of the table
> is going to be pretty random.

We know that constructed data distributions can destroy the
effectiveness of the ndistinct estimate and make sample size irrelevant.
But typical real world data distributions do improve their estimations
with increased sample size and so it is worthwhile.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2008-12-11 23:47:51 Re: benchmarking the query planner
Previous Message Tom Lane 2008-12-11 23:43:48 Re: benchmarking the query planner