Re: An Idea for planner hints

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Martijn van Oosterhout <kleptog(at)svana(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: An Idea for planner hints
Date: 2006-08-09 10:57:39
Message-ID: 44D9BFA3.1090800@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
>> ISTM theat the easiest way would be to introduce a sort of predicate
>> like so:
>
>> SELECT * FROM foo, bar WHERE pg_selectivity(foo.a = bar.a, 0.1);
>
> The one saving grace of Florian's proposal was that you could go hack
> the statistics *without* changing your queries. This throws that away
> again.
I think for this to be really effective, you'd actually need both - a
query-independent way specifying selectivities, and a way to influence
the estimates for a _single_ query.

Image a complex, autogenerated query with looks something like this
select ....
from t1
join t2 on ...
join t3 on ...
join t4 on ...
...
...
where
<big, complicated expression derived from some user input>.

This big, complicated expression looks different for every query - and
currently, postgres often vastly overestimates the selectivity of this
expression. This leads to weird join orders, and generally very bad
performance. Of course, *I* don't know the selectivity of this
expression myself - but experience tells me that on average it's
something like 50%, and not 1% as postgres believes. So, in that case,
being able to write

select ... join .... where pg_selectivity(<expression>, 0.5)
would be a big win.

> The thing I object to about the "I want to decorate my queries with
> planner hints" mindset is that it's coming at it from the wrong
> direction. You should never be thinking in terms of "fix this one
> query", because that just leads back into the same dead end that your
> fix doesn't work tomorrow. What you *should* be thinking about is "why
> did the planner get this wrong, and how do I fix the generic problem?".
> If you attack it that way then your fix is much more likely to work on
> the next slightly-different query.

Fixing the generic problem is surely the best _if_ there is a fix for
the generic problem at all. But if your where-conditions involves fields
from 10 different tables, then IMHO there is no way to _ever_ guarantee
that postgres will get correct selectivity estimates. But since (at
least for me) overestimating selectivity hurts fare more than
underestimating it, forcing postgres to just assume a certain
selectivity could help.

I'm not in any way saying that there should _only_ be selectivity
annotations inside the query - a query-independent mechanism would
be a very nice thing to have. But a query-independent mechanism
wont be sufficient in all cases IMHO.

greetings, Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message stark 2006-08-09 11:21:40 Re: Casts
Previous Message Pavel Stehule 2006-08-09 09:48:41 Re: proposal for PL packages for 8.3.