Re: Scaling shared buffer eviction

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-05-16 14:51:16
Message-ID: CAA4eK1+EGUPG2T+Eb7T7G=qr9bTDx-bc6S3R-KzTv+_h6zyhUA@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 15, 2014 at 11:11 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

>
> Data with LWLOCK_STATS
> ----------------------------------------------
> BufMappingLocks
>
> PID 7245 lwlock main 38: shacq 41117 exacq 34561 blk 36274 spindelay 101
> PID 7310 lwlock main 39: shacq 40257 exacq 34219 blk 25886 spindelay 72
> PID 7308 lwlock main 40: shacq 41024 exacq 34794 blk 20780 spindelay 54
> PID 7314 lwlock main 40: shacq 41195 exacq 34848 blk 20638 spindelay 60
> PID 7288 lwlock main 41: shacq 84398 exacq 34750 blk 29591 spindelay 128
> PID 7208 lwlock main 42: shacq 63107 exacq 34737 blk 20133 spindelay 81
> PID 7245 lwlock main 43: shacq 278001 exacq 34601 blk 53473 spindelay 503
> PID 7307 lwlock main 44: shacq 85155 exacq 34440 blk 19062 spindelay 71
> PID 7301 lwlock main 45: shacq 61999 exacq 34757 blk 13184 spindelay 46
> PID 7235 lwlock main 46: shacq 41199 exacq 34622 blk 9031 spindelay 30
> PID 7324 lwlock main 46: shacq 40906 exacq 34692 blk 8799 spindelay 14
> PID 7292 lwlock main 47: shacq 41180 exacq 34604 blk 8241 spindelay 25
> PID 7303 lwlock main 48: shacq 40727 exacq 34651 blk 7567 spindelay 30
> PID 7230 lwlock main 49: shacq 60416 exacq 34544 blk 9007 spindelay 28
> PID 7300 lwlock main 50: shacq 44591 exacq 34763 blk 6687 spindelay 25
> PID 7317 lwlock main 50: shacq 44349 exacq 34583 blk 6861 spindelay 22
> PID 7305 lwlock main 51: shacq 62626 exacq 34671 blk 7864 spindelay 29
> PID 7301 lwlock main 52: shacq 60646 exacq 34512 blk 7093 spindelay 36
> PID 7324 lwlock main 53: shacq 39756 exacq 34359 blk 5138 spindelay 22
>
> This data shows that with the patch there is no contention
> for BufFreelistLock; instead there is heavy contention around
> BufMappingLocks. I have checked that HEAD also has contention
> around BufMappingLocks.
>
> As per my analysis so far, reducing contention around
> BufFreelistLock is not sufficient to improve scalability; we need
> to work on reducing contention around BufMappingLocks as well.
>

To reduce the contention around BufMappingLocks, I tried the patch with
just the number of buffer partitions increased, and it shows a really
significant increase in scalability, due to reduced contention around both
BufFreelistLock and BufMappingLocks. The real effect of reducing
contention around BufFreelistLock was hidden earlier because the whole
contention had shifted to BufMappingLocks. I have taken performance data
for both HEAD+increase_buf_part and Patch+increase_buf_part to clearly show
the benefit of reducing contention around BufFreelistLock. This data was
taken using a pgbench read-only load (Select).

Performance Data
-------------------------------
HEAD + 64 = HEAD + (NUM_BUFFER_PARTITIONS(64) + LOG2_NUM_LOCK_PARTITIONS(6))
V1 + 64 = PATCH + (NUM_BUFFER_PARTITIONS(64) + LOG2_NUM_LOCK_PARTITIONS(6))
Similarly, 128 means 128 buffer partitions.
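
As a rough illustration of why more buffer partitions can help, the sketch below estimates how often one lookup lands on a partition that another concurrent lookup is also using. It is only a toy model: it assumes every lookup picks a partition uniformly at random and ignores the difference between shared and exclusive acquisitions. The function name share_probability is hypothetical, and the baseline of 16 partitions is an assumption (it matches lock IDs 38 to 53 in the quoted output).

```python
def share_probability(num_partitions: int, concurrent_lookups: int) -> float:
    """Probability that one lookup's partition is also hit by at least one
    of the other concurrent lookups.

    Toy model (assumption): each lookup chooses a partition uniformly at
    random and independently of the others.
    """
    others = concurrent_lookups - 1
    # (1 - 1/N) is the chance that a single other lookup misses our partition.
    return 1.0 - (1.0 - 1.0 / num_partitions) ** others


if __name__ == "__main__":
    # 16 is an assumed baseline; 64 and 128 are the values tested in this mail.
    for partitions in (16, 64, 128):
        for clients in (64, 128):
            p = share_probability(partitions, clients)
            print(f"partitions={partitions:3d} clients={clients:3d} share={p:.3f}")
```

The model only says that overlap becomes less likely as partitions increase; it makes no claim about actual throughput, which is what the measurements below are for.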

shared_buffers= 8GB
scale factor = 3000
RAM - 64GB

             Threads (64)   Threads (128)
HEAD                45562           17128
HEAD + 64           57904           32810
V1 + 64            105557           81011
HEAD + 128          58383           32997
V1 + 128           110705          114544

shared_buffers= 8GB
scale factor = 1000
RAM - 64GB

             Threads (64)   Threads (128)
HEAD                92142           31050
HEAD + 64          108120           86367
V1 + 64            117454          123429
HEAD + 128         107762           86902
V1 + 128           123641          124822

Observations
-------------------------
1. There is an increase of up to 5 times in performance for data that fits
in memory but not in shared buffers.
2. Although just increasing the number of buffer partitions improves
performance, it doesn't scale well on its own (see especially the case
where partitions were increased from 64 to 128).
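
For anyone who wants to recompute the ratios behind these observations, here is a small sketch that divides each configuration's figure by the plain HEAD figure for the same thread count. The numbers are copied from the two tables above; the names RESULTS and speedups are hypothetical helpers, not part of the patch.

```python
# Figures copied from the tables above, keyed by pgbench scale factor.
# Each value is (result at 64 threads, result at 128 threads).
RESULTS = {
    3000: {
        "HEAD": (45562, 17128),
        "HEAD + 64": (57904, 32810),
        "V1 + 64": (105557, 81011),
        "HEAD + 128": (58383, 32997),
        "V1 + 128": (110705, 114544),
    },
    1000: {
        "HEAD": (92142, 31050),
        "HEAD + 64": (108120, 86367),
        "V1 + 64": (117454, 123429),
        "HEAD + 128": (107762, 86902),
        "V1 + 128": (123641, 124822),
    },
}


def speedups(table, baseline="HEAD"):
    """Return each configuration's result divided by the baseline's result,
    separately for each thread count."""
    base = table[baseline]
    return {
        name: tuple(value / ref for value, ref in zip(values, base))
        for name, values in table.items()
    }


if __name__ == "__main__":
    for scale, table in RESULTS.items():
        print(f"scale factor = {scale}")
        for name, (r64, r128) in speedups(table).items():
            print(f"  {name:<11} 64 threads: {r64:.2f}x   128 threads: {r128:.2f}x")
```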

I have verified that contention around BufMappingLocks has reduced
by running the patch with LWLOCK_STATS:

BufFreelistLock
PID 17894 lwlock main 0: shacq 0 exacq 171 blk 27 spindelay 1

BufMappingLocks

PID 17902 lwlock main 38: shacq 12770 exacq 10104 blk 282 spindelay 0
PID 17924 lwlock main 39: shacq 11409 exacq 10257 blk 243 spindelay 0
PID 17929 lwlock main 40: shacq 13120 exacq 10739 blk 239 spindelay 0
PID 17940 lwlock main 41: shacq 11865 exacq 10373 blk 262 spindelay 0
..
..
PID 17831 lwlock main 162: shacq 12706 exacq 10267 blk 199 spindelay 0
PID 17826 lwlock main 163: shacq 11081 exacq 10256 blk 168 spindelay 0
PID 17903 lwlock main 164: shacq 11494 exacq 10375 blk 176 spindelay 0
PID 17899 lwlock main 165: shacq 12043 exacq 10485 blk 216 spindelay 0

We can clearly see that the numbers for *blk* have dropped significantly,
which shows that contention has reduced.
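
To make the comparison less dependent on eyeballing, the following sketch parses lines in the format shown above and computes blk as a fraction of all acquisitions (shacq + exacq). The regular expression is written against the lines quoted in this mail only, so treat the exact format as an assumption; parse_line and block_ratio are hypothetical helper names. Note also that lock IDs are not directly comparable between runs with different partition counts.

```python
import re

# Pattern matched against the LWLOCK_STATS lines quoted in this mail
# (assumption: other builds may print a different format).
LINE_RE = re.compile(
    r"PID (?P<pid>\d+) lwlock main (?P<lock>\d+): "
    r"shacq (?P<shacq>\d+) exacq (?P<exacq>\d+) "
    r"blk (?P<blk>\d+) spindelay (?P<spindelay>\d+)"
)


def parse_line(line):
    """Return the counters of one stats line as a dict of ints, or None
    if the line does not match (for example the '..' separators)."""
    match = LINE_RE.search(line)  # search() tolerates a leading '> ' quote marker
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def block_ratio(record):
    """blk divided by total acquisitions (shared + exclusive)."""
    acquisitions = record["shacq"] + record["exacq"]
    return record["blk"] / acquisitions if acquisitions else 0.0


if __name__ == "__main__":
    samples = [
        "> PID 7245 lwlock main 38: shacq 41117 exacq 34561 blk 36274 spindelay 101",
        "PID 17902 lwlock main 38: shacq 12770 exacq 10104 blk 282 spindelay 0",
    ]
    for line in samples:
        record = parse_line(line)
        print(record["lock"], record["blk"], f"{block_ratio(record):.4f}")
```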

The patch is still only in a shape to prove the merit of the idea. I have
just changed the number of partitions so that anyone who wants to verify
the performance for a similar load can do so by just applying
the patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
scalable_buffer_eviction_v2.patch application/octet-stream 14.9 KB
