Re: An Idea for planner hints

From: Arturo Pérez <arturo(at)ethicist(dot)net>
To: Jim C(dot)Nasby <jnasby(at)pervasive(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: An Idea for planner hints
Date: 2006-08-15 22:22:53
Message-ID: 32EA4E1D-7005-4ED4-8D49-C7A172EFF3F5@ethicist.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Aug 15, 2006, at 10:40 AM, Jim C. Nasby wrote:

> On Mon, Aug 14, 2006 at 11:41:29PM +0300, Hannu Krosing wrote:
>> ??hel kenal p??eval, E, 2006-08-14 kell 18:21, kirjutas Peter
>> Eisentraut:
>>> Perez wrote:
>>>> I thought, from watching the list for a while, that the planner
>>>> statistics needed were known but that how to gather the statistics
>>>> was not?
>>>
>>> I think over the course of the discussion we have figured out
>>> that we
>>> would like to have cross-column correlation statistics. The precise
>>> mathematical incarnation hasn't been determined yet, as far as I can
>>> see. Collecting the statistics thereafter isn't that hard, but
>>> there
>>> needs to be a way to not collect an exponential volume of
>>> statistics on
>>> all column combinations.
>>
>> I understood that the proposal was to collect only the stats where
>> needed (determined by user/dba) and use some rule-of-thumb values
>> if no
>> collected stats were available.
>
> Yeah, unless someone comes up with some kind of 'magic', I think
> trying
> to handle every cross-column possibility is a non-starter. IIRC, that
> argument is what's stalled cross-column stats every time in the
> past. It
> makes a lot more sense to allow defining what combinations of
> columns we
> need stats for.
>
> After that's done, it'd be easy to then write a script that will tell
> the database to collect stats on all multi-column indexes, RI, etc.
> Down
> the road, the planner could even be made to log (in a machine-readable
> format) every time it needs cross-column stats, and that data could be
> used to add stats that are needed.

If we're talking about my random neuron firing then I think the
responses have gone off
a bit. My thought was to just tell the planner the statistics that
are of interest.

An example of what I'm thinking would probably be helpful. Let's say
that the
DBA knows, through whatever means at his/her disposal (heck! the
magic you mention)
that column a & column b have some sort of correlation that the
planner can't determine
on its own but can use if it had it. The DBA therefore pokes the
right information into
the planner's statistical tables (or, perhaps, a more human-
manageable one that gets
"compiled" into the planner's stats).

For this to work we'd have to
1. Define the types of statistics that the planner could use in its
planning that
it cannot currently (or ever) collect itself. Cross-column
correlations, suitable
join selectivity, anything that would be useful to the planner.
2. Create a table or other data structure to contain this planner
information. Modify
the planner to use this information.
3. Document what these stats are, and the influence they have in a
format suitable
for use by DBAs, and how to add the stats to the above table. Mere
mortals can tinker
with this feature at their own peril :-)

Now, when a DBA has information that could steer the planner in the
right direction
he/she has a mechanism to do so that does not involve hinting the
specific query. My
hope would be that this information wouldn't go stale as fast as a
query hint would.
Furthermore, the DBA can improve an application's performance without
having to go
into every query it executes.

The planner would look in that table and say Ah! there's information
in here that says
that when a is joined to be it's going to eliminate 90% of my I/O.

Seems to me that such a feature would be a cool knob and address most/
all of the need for
query hints.

One other possibility for the above information would be just to have
a place for
the planner to save information for itself when it finds a plan to be
either horribly
over-optimistic or pessimistic.

Hope this blathering makes some kind of sense...
-arturo

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-08-15 22:42:17 Re: [HACKERS] Forcing current WAL file to be archived
Previous Message Kenneth Marshall 2006-08-15 22:08:22 Re: BugTracker (Was: Re: 8.2 features status)