Re: Hints proposal

From: Christopher Browne <cbbrowne(at)acm(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hints proposal
Date: 2006-10-13 02:54:02
Message-ID: 87r6xd9b91.fsf@wolfe.cbbrowne.com
Lists: pgsql-hackers pgsql-performance

Quoth rabroersma(at)yahoo(dot)com (Richard Broersma Jr):
>> By the way, wouldn't it be possible if the planner learned from a query
>> execution, so it would know if a choice for a specific plan or estimate
>> was actually correct or not for future reference? Or is that in the line
>> of DB2's complexity and a very hard problem and/or would it add too much
>> overhead?
>
> Just thinking out loud here...
>
> Wow, a learning cost-based planner sounds a lot like a problem for
> control & dynamical systems theory.

Alas, dynamic control theory, home of considerable numbers of
Hamiltonian equations, as well as Pontryagin's Minimum Principle, is
replete with:
a) Gory multivariate calculus
b) Need for all kinds of continuity requirements (e.g. - continuous,
smooth functions with no discontinuities or other "nastiness")
otherwise the math gets *really* nasty

We don't have anything even resembling "continuous" because our
measures are all discrete (e.g. - the base values are all integers).

> As I understand it, much of the advice given for setting
> PostgreSQL's tunable parameters comes from "RULES-OF-THUMB." I am
> sure that the effect on server performance of all of the parameters
> could be modeled and an adaptive feedback controller could be
> designed to tune these parameters as demand on the server changes.

Optimal control theory loves "bang-bang" control, where you go to
one extreme or the other; it requires all those continuity conditions
I mentioned, and is almost certainly not the right answer here.

> Although, I suppose that a controller like this would have limited
> success since some of the most effective parameters are not
> tunable at run time.
>
> With regard to query planning, I wonder if there is a way to model a
> controller that could adjust/alter query plans based on a comparison
> of expected and actual query execution times.

I think there would be something awesomely useful about recording
expected+actual statistics along with some of the plans.

The case that is easiest to argue for is where Actual >>> Expected
(i.e., Actual "was a whole lot larger than" Expected); in such cases,
you've already spent a LONG time on the query, which means that
spending a millisecond recording the moral equivalent of "Explain
Analyze" output should be an immaterial cost.
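
To make that concrete, here is a rough Python sketch of the "only record when Actual >>> Expected" rule. Everything in it is hypothetical: the names PlanSample and MisestimateLog, the factor-of-10 threshold, and the assumption that expected and actual are available in the same unit. None of this is an existing PostgreSQL facility.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanSample:
    """One recorded misestimate (hypothetical structure)."""
    query: str
    expected: float  # what the planner predicted (assumed unit: ms)
    actual: float    # what was measured, in the same unit


@dataclass
class MisestimateLog:
    """Keeps a sample only when actual exceeds expected by a large factor."""
    ratio_threshold: float = 10.0  # assumed cutoff for "a whole lot larger"
    samples: List[PlanSample] = field(default_factory=list)

    def observe(self, query: str, expected: float, actual: float) -> bool:
        # Guard against a zero or negative estimate before dividing.
        safe_expected = max(expected, 1e-9)
        if actual / safe_expected >= self.ratio_threshold:
            self.samples.append(PlanSample(query, expected, actual))
            return True
        return False


log = MisestimateLog()
log.observe("SELECT ... slow one", expected=5.0, actual=4000.0)  # recorded
log.observe("SELECT ... fine one", expected=5.0, actual=6.0)     # skipped
```

The point of the threshold is that the bookkeeping is only paid for on queries that were already expensive, so its relative cost stays small.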

If we could record a whole lot of these cases, and possibly, with some
anonymization / permissioning, feed the data to a central place, then
some analysis could be done to see if there's merit to particular
modifications to the query plan cost model.
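
One possible shape for that analysis, sketched in Python under my own assumptions: group the collected (expected, actual) pairs by plan node type and take the geometric mean of actual/expected. A result far from 1.0 for some node type would hint at a systematic bias in that part of the cost model. The function name and the sample numbers are made up purely for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def bias_by_node_type(
    samples: Iterable[Tuple[str, float, float]]
) -> Dict[str, float]:
    """Geometric mean of actual/expected per node type.

    Each sample is (node_type, expected, actual); both numbers are
    assumed to be positive and in the same unit.
    """
    log_ratios = defaultdict(list)
    for node_type, expected, actual in samples:
        log_ratios[node_type].append(math.log(actual / expected))
    # Averaging in log space keeps one huge outlier from dominating.
    return {
        node_type: math.exp(sum(vals) / len(vals))
        for node_type, vals in log_ratios.items()
    }


# Illustrative data only, not real measurements.
collected = [
    ("Index Scan", 1.0, 10.0),
    ("Index Scan", 1.0, 1000.0),
    ("Seq Scan", 2.0, 2.0),
]
bias = bias_by_node_type(collected)
```

The geometric mean is used rather than the arithmetic mean because misestimates are multiplicative: being off by 10x too high and 10x too low should cancel out rather than average to roughly 5x.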

Part of the *really* fundamental query optimization problem is that
there seems to be some evidence that the cost model isn't perfectly
reflective of the costs of queries. Improving the quality of the cost
model is one of the factors that would improve the performance of the
query optimizer. That would represent a fundamental improvement.
--
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/languages.html
"If I can see farther it is because I am surrounded by dwarves."
-- Murray Gell-Mann
