pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Allow Pin/UnpinBuffer to operate in a lockfree manner.
Date: 2016-04-11 03:12:51
Message-ID: E1apSHT-0007xJ-Bh@gemulon.postgresql.org
Lists: pgsql-committers pgsql-hackers

Allow Pin/UnpinBuffer to operate in a lockfree manner.

Pinning/Unpinning a buffer is a very frequent operation, especially in
read-mostly, cache-resident workloads. Benchmarking shows that in various
scenarios the spinlock protecting a buffer header's state becomes a
significant bottleneck. The problem can be reproduced with pgbench -S on
larger machines, but can be considerably worse for queries which touch
the same buffers over and over at a high frequency (e.g. nested loops
over a small inner table).

To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable;
that allows them to be manipulated together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could
be lifted by using a 64bit field, but it's not a realistic configuration
atm).
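
To make the packing concrete, here is a minimal sketch of how such a
32bit state word could be split up. Only the 18bit refcount follows
from the MAX_BACKENDS limit stated above; the 4bit usage count, the
remaining 10 flag bits, and all SK_/sk_ names are assumptions made for
illustration, not the definitions in buf_internals.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: bits 0-17 refcount, 18-21 usage count, 22-31 flags. */
#define SK_REFCOUNT_BITS    18
#define SK_USAGECOUNT_BITS  4

#define SK_REFCOUNT_ONE     1U
#define SK_REFCOUNT_MASK    ((1U << SK_REFCOUNT_BITS) - 1)
#define SK_USAGECOUNT_SHIFT SK_REFCOUNT_BITS
#define SK_USAGECOUNT_ONE   (1U << SK_USAGECOUNT_SHIFT)
#define SK_USAGECOUNT_MASK  (((1U << SK_USAGECOUNT_BITS) - 1) << SK_USAGECOUNT_SHIFT)
#define SK_FLAG_MASK        (~(SK_REFCOUNT_MASK | SK_USAGECOUNT_MASK))

/* Every backend may hold one shared pin, so the refcount field caps it. */
#define SK_MAX_BACKENDS     SK_REFCOUNT_MASK

static_assert((SK_REFCOUNT_MASK & SK_USAGECOUNT_MASK) == 0,
              "refcount and usage count must not overlap");
static_assert(SK_MAX_BACKENDS == 262143U, "2^18-1");

/* Hypothetical stand-in for the relevant part of BufferDesc. */
typedef struct SketchBufferDesc
{
    _Atomic uint32_t state;     /* flags, usage count and refcount together */
} SketchBufferDesc;

/* Accessors that decode a snapshot of the state word. */
static inline uint32_t
sk_refcount(uint32_t state)
{
    return state & SK_REFCOUNT_MASK;
}

static inline uint32_t
sk_usagecount(uint32_t state)
{
    return (state & SK_USAGECOUNT_MASK) >> SK_USAGECOUNT_SHIFT;
}

static inline uint32_t
sk_flags(uint32_t state)
{
    return state & SK_FLAG_MASK;
}
```

Because all fields live in one word, a single atomic read yields a
consistent snapshot of refcount, usage count and flags.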

As not all operations can easily be implemented in a lockfree manner,
implement the previous buf_hdr_lock via a flag bit in the atomic
variable. That way we can continue to lock the header in places where
it's needed, but can get away without acquiring it in the more frequent
hot-paths. There are some additional operations which can be done without
the lock, but aren't in this patch; but the most important places are
covered.
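
The interplay between the lock bit and the lockfree path can be
sketched roughly as follows, using C11 atomics rather than the
PostgreSQL atomics API. The bit position of the lock flag and all
SK_/sk_ names are hypothetical; the point is only that pin/unpin use
compare-and-swap and merely wait while the lock bit is set, whereas
code that needs a stable header sets the bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SK_REFCOUNT_ONE  1U
#define SK_REFCOUNT_MASK ((1U << 18) - 1)
#define SK_LOCKED        (1U << 22)     /* assumed position of the lock flag */

static_assert((SK_LOCKED & SK_REFCOUNT_MASK) == 0,
              "lock flag must not overlap the refcount");

/* Take the header lock: spin until we are the one who set the bit. */
static uint32_t
sk_lock_hdr(_Atomic uint32_t *state)
{
    for (;;)
    {
        uint32_t old = atomic_fetch_or(state, SK_LOCKED);

        if (!(old & SK_LOCKED))
            return old | SK_LOCKED;
        /* somebody else holds it; a real implementation would back off */
    }
}

static void
sk_unlock_hdr(_Atomic uint32_t *state)
{
    atomic_fetch_and(state, ~SK_LOCKED);
}

/* Add delta to the refcount without taking the lock. */
static void
sk_change_refcount(_Atomic uint32_t *state, uint32_t delta)
{
    uint32_t old = atomic_load(state);

    for (;;)
    {
        if (old & SK_LOCKED)
        {
            /* header is locked: wait, do not modify it underneath the holder */
            old = atomic_load(state);
            continue;
        }
        /* on failure the current value is stored into old and we retry */
        if (atomic_compare_exchange_weak(state, &old, old + delta))
            return;
    }
}

static void
sk_pin(_Atomic uint32_t *state)
{
    sk_change_refcount(state, SK_REFCOUNT_ONE);
}

static void
sk_unpin(_Atomic uint32_t *state)
{
    /* unsigned wraparound makes adding -1 a decrement of the refcount */
    sk_change_refcount(state, (uint32_t) 0 - SK_REFCOUNT_ONE);
}
```

The hot path never writes the lock bit, so uncontended pins cost one
compare-and-swap instead of a spinlock acquire/release pair.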

As bufmgr.c now essentially re-implements spinlocks, abstract the delay
logic from s_lock.c into something more generic. It already has two
users, and more are coming up; there's a follow-up patch for lwlock.c at
least.
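
As a rough illustration of what a reusable delay helper can look like,
the sketch below keeps the backoff state in a small struct that any
spinning caller can own. It is not the interface added to s_lock.h:
the names, the constants, and the plain doubling of the sleep time are
assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

#define SK_SPINS_PER_DELAY 100U      /* busy-loop iterations before sleeping */
#define SK_MIN_DELAY_US    1000U     /* first sleep length */
#define SK_MAX_DELAY_US    1000000U  /* sleep length never exceeds this */

static_assert(SK_MIN_DELAY_US <= SK_MAX_DELAY_US, "delay bounds are ordered");

/* Per-wait state, owned by whoever is spinning. */
typedef struct SketchSpinDelay
{
    uint32_t spins;         /* spins since the last sleep */
    uint32_t cur_delay_us;  /* length of the next sleep */
} SketchSpinDelay;

static void
sk_delay_init(SketchSpinDelay *d)
{
    d->spins = 0;
    d->cur_delay_us = SK_MIN_DELAY_US;
}

/*
 * Call once per failed attempt. Returns 0 if the caller should simply
 * retry, or the number of microseconds it should sleep before retrying.
 */
static uint32_t
sk_delay_next(SketchSpinDelay *d)
{
    uint32_t sleep_us;

    if (++d->spins < SK_SPINS_PER_DELAY)
        return 0;

    /* spun long enough: hand out the current delay and grow the next one */
    d->spins = 0;
    sleep_us = d->cur_delay_us;
    d->cur_delay_us *= 2;
    if (d->cur_delay_us > SK_MAX_DELAY_US)
        d->cur_delay_us = SK_MAX_DELAY_US;
    return sleep_us;
}
```

Keeping the policy separate from the lock word is what lets a spinlock,
a buffer header lock bit, and other users share it.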

This patch is based on a proof-of-concept written by me, which Alexander
Korotkov made into a fully working patch; the committed version is again
revised by me. Benchmarking and testing have been provided by, amongst
others, Dilip Kumar, Alexander Korotkov, and Robert Haas.

On a large x86 system, improvements of a factor of 8 have been observed
for readonly pgbench with a high client count.

Author: Alexander Korotkov and Andres Freund
Discussion: 2400449(dot)GjM57CE0Yg(at)dinodell

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/48354581a49c30f5757c203415aa8412d85b0f70

Modified Files
--------------
contrib/pg_buffercache/pg_buffercache_pages.c |  15 +-
src/backend/storage/buffer/buf_init.c         |   7 +-
src/backend/storage/buffer/bufmgr.c           | 508 +++++++++++++++++---------
src/backend/storage/buffer/freelist.c         |  44 ++-
src/backend/storage/buffer/localbuf.c         |  64 ++--
src/backend/storage/lmgr/s_lock.c             | 206 ++++++-----
src/include/postmaster/postmaster.h           |  15 +-
src/include/storage/buf_internals.h           | 101 +++--
src/include/storage/s_lock.h                  |  18 +
src/tools/pgindent/typedefs.list              |   1 +
10 files changed, 622 insertions(+), 357 deletions(-)
