Re: benchmarking the query planner

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Nathan Boley" <npboley(at)gmail(dot)com>
Cc: "Vladimir Sitnikov" <sitnikov(dot)vladimir(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: benchmarking the query planner
Date: 2008-12-12 00:30:16
Message-ID: 14209.1229041816@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Nathan Boley" <npboley(at)gmail(dot)com> writes:
> Well, ISTM there is a profound difference. For scalarineqsel we care
> about the total number of values in a bucket. For eqsel we care about
> the total number of *distinct* values in each bucket

Really?

> IMHO, the whole idea of increasing mcv's seems a mistake. Why not use
> the limited storage in pg_statistic to try and estimate the
> selectivity for ranges of values rather than a single value?

MCVs are useful for questions that are not related to ranges of values
--- an example of something highly useful we do with them is to try to
estimate the fraction of a column that satisfies a LIKE or regex
pattern.

In fact, as I was just pointing out to Bruce, we can compute them and
do useful things with them for datatypes that don't have a defined sort
order and so the whole concept of "range" is meaningless.

Now, if your point is that it'd be more flexible to not tie the MCV list
length to the histogram length, you're right. I'm not sure we can
expect the average DBA to set two separate knobs very effectively,
though. We do already have some smarts in there to try to set the MCV
list length intelligently instead of slavishly following the stats
target --- perhaps it wouldn't be a bad idea to develop similar
heuristics for choosing an actual histogram length.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KaiGai Kohei 2008-12-12 00:32:12 Re: Updates of SE-PostgreSQL 8.4devel patches (r1268)
Previous Message KaiGai Kohei 2008-12-12 00:18:14 Re: Updates of SE-PostgreSQL 8.4devel patches (r1268)