Re: 2nd Level Buffer Cache

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 2nd Level Buffer Cache
Date: 2011-03-23 20:49:17
Message-ID: 201103232149.18156.rsmogura@softperience.eu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> Wednesday 23 March 2011 21:30:04
> On Wed, Mar 23, 2011 at 8:00 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > It looks like the only way anything can ever get put on the free list
> > right now is if a relation or database is dropped. That doesn't seem
> > too good. I wonder if the background writer shouldn't be trying to
> > maintain the free list. That is, perhaps BgBufferSync() should notice
> > when the number of free buffers drops below some threshold, and run
> > the clock sweep enough to get it back up to that threshold.
>
> I think this is just a terminology discrepancy. In postgres the free
> list is only used for buffers that contain no useful data at all. The
> only time there are buffers on the free list is at startup or if a
> relation or database is dropped.
>
> Most of the time blocks are read into buffers that already contain
> other data. Candidate buffers to evict are buffers that have been used
> least recently. That's what the clock sweep implements.
>
> What the bgwriter's responsible for is looking at the buffers *ahead*
> of the clock sweep and flushing them to disk. They stay in ram and
> don't go on the free list, all that changes is that they're clean and
> therefore can be reused without having to do any i/o.
>
> I'm a bit skeptical that this works because as soon as bgwriter
> saturates the i/o the os will throttle the rate at which it can write.
> When that happens even a few dozens of milliseconds will be plenty to
> allow the purely user-space processes consuming the buffers to catch
> up instantly.
>
> But Greg Smith has done a lot of work tuning the bgwriter so that it
> is at least useful in some circumstances. I could well see it being
> useful for systems where latency matters and the i/o is not saturated.

Freelist is almost useless under normal operations, but it's only one check if
it's empty or not, which could be optimized by checking (...> -1), or !(... <
0)

Regards,
Radek

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2011-03-23 20:50:36 Re: psql \dt and table size
Previous Message Alvaro Herrera 2011-03-23 20:33:46 Re: psql \dt and table size