Re: Clock sweep not caching enough B-Tree leaf pages?

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Ants Aasma <ants(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-18 11:16:31
Message-ID: CAOeZVicEUNUk-YC0KaZvR37kZWL=otm0TcaV_gzPPEfAqWf9aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 18, 2014 at 7:27 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

A way I have in mind about eviction policy is to introduce a way to have an
ageing factor in each buffer and take the ageing factor into consideration
when evicting a buffer.

Consider a case where a table is pretty huge and spread across multiple
pages. The querying pattern is like a time series pattern i.e. a set of
rows is queried pretty frequently for some time, making the corresponding
page hot. Then, the next set of rows is queried frequently making that page
hot and so on.

Consider a new page entering the shared buffers with refcount 1 and
usage_count 1. If that page is a part of the workload described above, it
is likely that it shall not be used for a considerable amount of time after
it has entered the buffers but will be used eventually.

Now, the current hypothetical situation is that we have three pages:

1) The page that used to be hot at the previous time window but is no
longer hot and is actually the correct candidate for eviction.
2) The current hot page (It wont be evicted anyway for now).
3) The new page which just got in and should not be evicted since it can be
hot soon (for this workload it will be hot in the next time window).

When Clocksweep algorithm runs the next time, it will see the new buffer
page as the one to be evicted (since page (1) may still have usage_count >
0 i.e. it may be 'cooling' but not 'cool' yet.)

This can be changed by introducing an ageing factor that sees how much time
the current buffer has spend in shared buffers. If the time that the buffer
has spent is large enough (relatively) and it is not hot currently, that
means it has had its chance and can be evicted. This shall save the new
page (3) from being evicted since it's time in shared buffers shall not be
high enough to mandate eviction and it shall be given more chances.

Since gettimeofday() is an expensive call and hence cannot be done in the
tight loop, we can count the number of clocksweeps the current buffer has
seen (rather, survived). This shall give us a rough idea of the estimate of
the relative age of the buffer.

When an eviction happens, all the candidates with refcount = 0 shall be
taken.Then, among them, the one with highest ageing factor shall be evicted.

Of course, there may be better ways of doing the same, but I want to
highlight the point (or possibility) of introducing an ageing factor to
prevent eviction of relatively younger pages early in the eviction process.

The overhead isnt too big. We just need to add another attribute in buffer
header for the number of clocksweeps seen (rather, survived) and check it
when an eviction is taking place.The existing spinlock for buffer headers
shall be good for protecting contention and access. The access rules can be
similar to that of usage_count.

Thoughts and comments?

Regards,

Atri

--
Regards,

Atri
*l'apprenant*

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2014-04-18 11:51:56 Re: [bug fix] pg_ctl always uses the same event source
Previous Message Amit Langote 2014-04-18 11:05:50 Re: Typo fix in src/backend/access/transam/recovery.conf.sample