Re: BufFreelistLock

From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BufFreelistLock
Date: 2010-12-13 02:48:37
Message-ID: DAED5995-42C5-488B-B1A6-C251968937E2@nasby.net
Lists: pgsql-hackers

On Dec 10, 2010, at 10:49 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Excerpts from Jeff Janes's message of Fri Dec 10 12:24:34 -0300 2010:
>>> As far as I can tell, bgwriter never adds things to the freelist.
>>> That is only done at start up, and when a relation or a database is
>>> dropped. The clock sweep does the vast majority of the work.
>
>> AFAIU bgwriter runs the clock sweep most of the time (BgBufferSync).
>
> I think bgwriter just tries to write out dirty buffers so they'll be
> clean when the clock sweep reaches them. It doesn't try to move them to
> the freelist.

Yeah, it calls SyncOneBuffer which does nothing for the clock sweep.

> There might be some advantage in having it move buffers
> to a freelist that's just protected by a simple spinlock (or at least,
> a lock different from the one that protects the clock sweep). The
> idea would be that most of the time, backends just need to lock the
> freelist for long enough to take a buffer off it, and don't run clock
> sweep at all.

Yeah, the clock sweep code is very expensive compared to pulling a buffer from the freelist, yet AFAICT nothing will run the clock sweep except backends. Unless I'm missing something, the freelist is practically useless because buffers are only put there by InvalidateBuffer, which is only called by DropRelFileNodeBuffers and DropDatabaseBuffers. So we make backends queue up behind the freelist lock with very little chance of getting a buffer, then we make them queue up for the clock sweep lock and make them actually run the clock sweep.
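
To make the cost difference concrete, here is a toy model of the two paths. It is only a sketch, not PostgreSQL code: every Toy*/toy_* name is made up for illustration, there is no locking at all, and toy_refill_freelist() models the idea floated upthread (something other than a backend feeding the freelist), not anything the server does today.

```c
#include <assert.h>

#define TOY_NBUFFERS    8      /* tiny pool so the numbers are easy to follow */
#define TOY_NOT_IN_LIST (-2)   /* buffer is not on the freelist */
#define TOY_END_OF_LIST (-1)   /* freelist terminator */

typedef struct {
    int usage_count;   /* decremented by the sweep; 0 means evictable */
    int pinned;        /* nonzero while someone is using the buffer */
    int free_next;     /* next freelist entry, or one of the markers above */
} ToyBuffer;

typedef struct {
    ToyBuffer buf[TOY_NBUFFERS];
    int  first_free;   /* head of the freelist */
    int  next_victim;  /* clock hand */
    long sweep_steps;  /* buffers inspected by the sweep so far */
} ToyPool;

/* Start with an empty freelist and every buffer at the same usage count. */
void toy_pool_init(ToyPool *p, int usage_count)
{
    assert(usage_count >= 0);
    for (int i = 0; i < TOY_NBUFFERS; i++) {
        p->buf[i].usage_count = usage_count;
        p->buf[i].pinned = 0;
        p->buf[i].free_next = TOY_NOT_IN_LIST;
    }
    p->first_free = TOY_END_OF_LIST;
    p->next_victim = 0;
    p->sweep_steps = 0;
}

void toy_freelist_push(ToyPool *p, int id)
{
    assert(id >= 0 && id < TOY_NBUFFERS);
    assert(p->buf[id].free_next == TOY_NOT_IN_LIST);
    p->buf[id].free_next = p->first_free;
    p->first_free = id;
}

/* Cheap path: constant-time pop. Returns -1 if the freelist is empty. */
int toy_freelist_pop(ToyPool *p)
{
    int id = p->first_free;
    if (id == TOY_END_OF_LIST)
        return -1;
    p->first_free = p->buf[id].free_next;
    p->buf[id].free_next = TOY_NOT_IN_LIST;
    return id;
}

/* Expensive path: advance the hand until an unpinned buffer with
 * usage_count 0 turns up. Returns -1 if a whole revolution finds
 * nothing but pinned or already-free buffers. */
int toy_clock_sweep(ToyPool *p)
{
    int skipped = 0;   /* consecutive unusable buffers seen */
    while (skipped < TOY_NBUFFERS) {
        int id = p->next_victim;
        ToyBuffer *b = &p->buf[id];
        p->next_victim = (id + 1) % TOY_NBUFFERS;
        p->sweep_steps++;
        if (b->pinned || b->free_next != TOY_NOT_IN_LIST) {
            skipped++;
        } else if (b->usage_count > 0) {
            b->usage_count--;
            skipped = 0;
        } else {
            return id;
        }
    }
    return -1;
}

/* What a backend does: try the freelist, then fall back to the sweep. */
int toy_get_buffer(ToyPool *p)
{
    int id = toy_freelist_pop(p);
    return (id >= 0) ? id : toy_clock_sweep(p);
}

/* Hypothetical: a helper process runs the sweep and parks victims on the
 * freelist so backends can take the cheap path. Returns how many it added. */
int toy_refill_freelist(ToyPool *p, int want)
{
    int added = 0;
    while (added < want) {
        int id = toy_clock_sweep(p);
        if (id < 0)
            break;
        toy_freelist_push(p, id);
        added++;
    }
    return added;
}
```

With an empty freelist and every buffer at usage_count 1, the first toy_get_buffer() call has to inspect TOY_NBUFFERS + 1 buffers before it finds a victim. After toy_refill_freelist() has run, the next call inspects none. Whether that trade is a win in the real server depends on lock contention, which this model deliberately ignores.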

BTW, when we moved from 96G to 192G servers I tried increasing shared buffers from 8G to 28G and performance went down enough to be noticeable (we don't have any good benchmarks, so I can't really quantify the degradation). Going back to 8G brought performance back up, so it seems like it was the change in shared buffers that caused the issue (the larger servers also have 24 cores vs 16). My immediate thought was that we needed more lock partitions, but I haven't had the chance to see if that helps. ISTM the issue could just as well be due to clock sweep suddenly taking over 3x longer than before.
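
For a rough sense of scale, the sketch below turns those two settings into buffer counts. It assumes the default 8kB block size and that "G" means 1024^3 bytes; toy_buffers_for() and TOY_BLCKSZ are names made up for illustration. The ratio is 3.5 either way, and it only bounds the length of a full revolution of the clock hand. It says nothing about how many buffers a single allocation actually has to inspect.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BLCKSZ 8192   /* assumed block size in bytes (the default) */

/* Number of shared buffers for a given shared_buffers setting in GiB. */
uint64_t toy_buffers_for(uint64_t shared_buffers_gib)
{
    assert(shared_buffers_gib > 0);
    return shared_buffers_gib * 1024 * 1024 * 1024 / TOY_BLCKSZ;
}
```

That gives 1,048,576 buffers at 8G and 3,670,016 at 28G, so one full pass of the hand covers 3.5 times as many buffers.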

We're working on getting a performance test environment set up, so hopefully in a month or two we'll be able to actually run some testing on this.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
