Re: WIP: cross column correlation ...

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Subject: Re: WIP: cross column correlation ...
Date: 2011-02-25 13:18:40
Message-ID: AANLkTi=ggrfv6fZJXf7iy-BJyU=nPMhDTqxQ-SBEa9xu@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 25, 2011 at 1:33 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Josh Berkus wrote:
>> On 2/23/11 7:10 AM, Robert Haas wrote:
>> > IME, most bad query plans are caused by either incorrect
>> > estimates of selectivity, or wrongheaded notions about what's likely
>> > to be cached.  If we could find a way, automated or manual, of
>> > providing the planner some better information about the facts of life
>> > in those areas, I think we'd be way better off.  I'm open to ideas
>> > about what the best way to do that is.
>>
>> As previously discussed, I'm fine with approaches which involve
>> modifying database objects.  These are auditable and centrally manageable
>> and aren't devastating to upgrades.
>>
>> So things like the proposed "CREATE SELECTIVITY" would be OK in a way
>> that decorating queries would not.
>>
>> Similarly, I would love to be able to set "cache %" on a per-relation
>> basis, and override the whole dubious calculation involving
>> random_page_cost for scans of that table.
>
> We should just find a way of checking what percentage of a table is
> already in the shared buffers.  That doesn't help us with the kernel
> cache, but it would be a good start and something that doesn't require
> user tuning.

You're reinventing a wheel that's already been discarded multiple
times. There are at least four separate problems:

1. The percentage of the table which is cached in shared_buffers at
plan time need not match the percentage that is cached at execution
time. A delay of even a few seconds between planning and execution
could make the numbers totally different, and plans can be cached for
much longer than that.

2. Because shared_buffers can turn over quite quickly, planning the
statement multiple times in relatively quick succession could give
different results each time. Previous discussions on this topic have
concluded that DBAs hate plan instability, and hate GEQO because it
causes plan instability, and this would inject plan instability into
the main planner.

3. The percentage of the table which is cached in shared_buffers is
not necessarily representative of the percentage which is cached in
general. On a large machine, shared_buffers may be less than 10% of
the total cache. It would be unwise to make guesses about what is and
is not cached based on a small percentage of the cache.

4. Even if we could accurately estimate the percentage of the table
that is cached, what then? For example, suppose that a user issues a
query which retrieves 1% of a table, and we know that 1% of that table
is cached. How much of the data that the user asked for is cache?
Hard to say, right? It could be none of it or all of it. The second
scenario is easy to imagine - just suppose the query's been executed
twice. The first scenario isn't hard to imagine either.

One idea Tom and I kicked around previously is to set an assumed
caching percentage for each table based on its size relative to
effective_cache_size - in other words, assume that the smaller a table
is, the more of it will be cached. Consider a system with 8GB of RAM,
and a table which is 64kB. It is probably unwise to make any plan
based on the assumption that that table is less than fully cached. If
it isn't before the query executes, it soon will be. Going to any
amount of work elsewhere in the plan to avoid the work of reading that
table in from disk is probably a dumb idea. Of course, one downside
of this approach is that it doesn't know which tables are hot and
which tables are cold, but it would probably still be an improvement
over the status quo.
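
To make that concrete, here is a rough sketch of what that heuristic
might look like, written as a query rather than planner code (the
linear falloff is just for illustration - we never got as far as
settling on an actual formula):

    -- hypothetical: assume caching falls off linearly with size
    SELECT c.relname,
           pg_relation_size(c.oid) AS bytes,
           greatest(0.0, 1.0 - pg_relation_size(c.oid)::float8 / ecs.bytes)
               AS assumed_cached_fraction
    FROM pg_class c,
         (SELECT setting::bigint * 8192 AS bytes  -- unit is 8kB pages
            FROM pg_settings
           WHERE name = 'effective_cache_size') ecs
    WHERE c.relkind = 'r'
      AND c.oid >= 16384                          -- skip system catalogs
    ORDER BY assumed_cached_fraction DESC;

Under that curve, the 64kB table above comes out as essentially fully
cached, and anything at least as large as effective_cache_size comes
out fully cold.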

All that having been said, I think that while Josh is thinking fuzzily
about the mathematics of his proposal, the basic idea is pretty
sensible. It is not easy - likely not possible - for the system to
have a good idea which things will be in some kind of cache at the
time the query is executed; it could even change mid-query. The
execution of one part of the query could evict from the cache data
which some other part of the plan assumed would be cached. But DBAs
frequently have a very good idea of which stuff is in cache - they can
make observations over a period of time and then adjust settings and
then observe some more and adjust some more.
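
For the shared_buffers portion, at least, contrib/pg_buffercache
already makes that sort of observation possible - something along
these lines (note that the percentage is of shared_buffers only, not
of total system cache, which is exactly the limitation from point 3):

    -- which relations occupy the most shared buffers right now?
    SELECT c.relname,
           count(*) AS buffers,
           round(100.0 * count(*) /
                 (SELECT setting::int FROM pg_settings
                   WHERE name = 'shared_buffers'), 2) AS pct_of_cache
    FROM pg_buffercache b
    JOIN pg_class c
      ON b.relfilenode = pg_relation_filenode(c.oid)
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;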

PostgreSQL is extremely easy to administer compared with some of its
competitors, and it's frequently necessary to change very little. But
there's a difference between what you absolutely have to change to
make it work and what you have the option to change when necessary.
We need to decrease the amount of stuff in the first category (as we
recently did with wal_buffers) and increase the amount of stuff in the
second category. People coming from Oracle are not favorably
impressed either by the amount of monitoring data PostgreSQL can
gather or by the number of knobs that are available to fix problems
when they occur. We don't need to have as many knobs as Oracle, and we
probably don't want to - for that matter, we probably couldn't even if
we did want to, for lack of manpower - but that doesn't mean we should
have none.

Maybe sometime during my life someone will invent a self-driving car
where I can just get in and say "take me to Bruce's house" and an hour
later it'll parallel park at the end of his driveway. That will be
great. But I think that the first generation of self-driving cars
will still have a steering wheel, and a brake pedal, and a little
switch that turns self-driving mode OFF. It is one thing to say that
you have a system which is really good and does not need much manual
adjustment, and we have such a system. It is another thing altogether
to systematically remove, or refuse to add, any controls that might
permit adjustment in cases where it is necessary. That can be the
right thing to do if the system is of such high quality that such
manual adjustment is so unlikely to be necessary as not to be worth
worrying about; but we are not at that point. And frankly I think if
we don't add some knobs to let this stuff be tuned manually, we will
never get the experience we need to write good auto-tuning algorithms.
Greg Smith would not have known what algorithm to propose for tuning
the wal_buffers option if he had not had a bunch of experience setting
it by hand.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
