Re: benchmarking the query planner

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "jd(at)commandprompt(dot)com" <jd(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: benchmarking the query planner
Date: 2008-12-11 23:50:19
Message-ID: 200812112350.mBBNoJn23176@momjian.us
Lists: pgsql-hackers

Tom Lane wrote:
> "Robert Haas" <robertmhaas(at)gmail(dot)com> writes:
> >> On the whole I think we have some evidence here to say that upping the
> >> default value of default_stats_target to 100 wouldn't be out of line,
> >> but 1000 definitely is. Comments?
>
> > Do you think there's any value in making it scale based on the size of
> > the table?
>
> As far as the MCVs go, I think we already have a decent heuristic for
> determining the actual length of the array, based on discarding values
> that have too small an estimated frequency --- look into
> compute_scalar_stats. I don't see that explicitly considering table
> size would improve that. It might be worth limiting the size of the
> histogram, as opposed to the MCV list, for smaller tables. But that's
> not relevant to the speed of eqjoinsel, and AFAIK there aren't any
> estimators that are O(N^2) in the histogram length.
> (ineq_histogram_selectivity is actually O(log N), so it hardly cares at
> all.)

Why is selfuncs.c::var_eq_const() doing a linear scan over the MCV array
instead of keeping the list sorted and doing a binary search on it? We
already do this for histogram lookups, as you mentioned. Does it not
matter? It didn't matter for ten entries, but it might for larger MCV
lists.

> > Otherwise, I am a bit concerned that 10 -> 100 may be too big a jump
> > for one release, especially since it may cause the statistics to get
> > toasted in some cases, which comes with a significant performance hit.
> > I would raise it to 30 or 50 and plan to consider raising it further
> > down the road. (I realize I just made about a million enemies with
> > that suggestion.)
>
> There's something in what you say, but consider that we have pretty
> much unanimous agreement that 10 is too small. I think we should
> try to fix the problem, not just gradually ratchet up the value until
> people start complaining in the other direction. (Also, we should have
> plenty of opportunity during beta to find out if we went too far.)

I am excited we are addressing the low default statistics target value.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
