Re: Scaling shared buffer eviction

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling shared buffer eviction
Date: 2014-08-28 11:11:42
Message-ID: CAA4eK1+5A1+_N+RLBOBzJFfdT4jvaDqJoWQ9cyF=no_4yLO5og@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 27, 2014 at 8:34 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Aug 26, 2014 at 10:53 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > Today, while working on updating the patch to improve locking
> > I found that as now we are going to have a new process, we need
> > a separate latch in StrategyControl to wakeup that process.
> > Another point is I think it will be better to protect
> > StrategyControl->completePasses with victimbuf_lck rather than
> > freelist_lck, as when we are going to update it we will already be
> > holding the victimbuf_lck and it doesn't make much sense to release
> > the victimbuf_lck and reacquire freelist_lck to update it.
>
> Sounds reasonable. I think the key thing at this point is to get a
> new version of the patch with the background reclaim running in a
> different process than the background writer. I don't see much point
> in fine-tuning the locking regimen until that's done.
>
>
I have updated the patch to address the feedback. The main changes are:

1. A separate process (bgreclaimer) now populates the freelist,
instead of bgwriter doing it.
2. The low and high threshold values for buffers in the freelist are
autotuned. I have used the formula you suggested upthread.
3. Cleaned up the locking regimen as discussed upthread (the
BufFreelist lock is eliminated completely).
4. Improved comments and general code cleanup.
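
As a rough illustration of points 1 and 2 (this is not code from the
patch), the toy model below shows a freelist that a separate reclaimer
refills up to a high threshold once it has dropped below a low
threshold. The class name, method names and threshold values are all
hypothetical, the autotuning formula from upthread is not reproduced,
and the "clock sweep" is reduced to a counter.

```python
from collections import deque


class FreeListSketch:
    """Toy model of a freelist kept between two thresholds (hypothetical)."""

    def __init__(self, low_mark, high_mark):
        # Assumed invariant: the low threshold sits below the high one.
        assert 0 <= low_mark < high_mark
        self.low_mark = low_mark
        self.high_mark = high_mark
        self.free = deque()
        self.next_victim = 0  # stand-in for the clock sweep hand

    def needs_reclaim(self):
        # True once the freelist has dropped below the low threshold.
        return len(self.free) < self.low_mark

    def reclaim(self):
        # Work done by the separate reclaimer when it is woken:
        # add buffers until the high threshold is reached.
        added = 0
        while len(self.free) < self.high_mark:
            self.free.append(self.next_victim)
            self.next_victim += 1
            added += 1
        return added

    def get_buffer(self):
        # Backend path: take a free buffer (None if the list is empty)
        # and report whether the reclaimer should be woken.
        buf = self.free.popleft() if self.free else None
        return buf, self.needs_reclaim()


if __name__ == "__main__":
    fl = FreeListSketch(low_mark=2, high_mark=5)
    fl.reclaim()
    for _ in range(4):
        buf, wake = fl.get_buffer()
        print(buf, "wake reclaimer" if wake else "ok")
```

In this model backends never run the sweep themselves while the list
is non-empty; they only signal that a refill is due, which is the
division of work the separate process is meant to provide.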

I have not yet added the statistic (buffers_backend_clocksweep),
because it needs one more variable in the BufferStrategyControl
structure, to which this patch has already added a few variables.
I think it is important to have such a stat available via
pg_stat_bgwriter, but I am not sure whether it is worth making the
structure a bit bulkier.

Another minor point concerns the changes in lwlock.h. The comment
above the lock definitions there says:

 * if you remove a lock, consider leaving a gap in the numbering
 * sequence for the benefit of DTrace and other external debugging
 * scripts.

As I have removed the BufFreelist lock, I have adjusted the numbering
in lwlock.h as well. The comment suggests leaving a gap when a lock
is removed, but I was not sure whether this case (removing the first
element) can affect anything, so for now I have adjusted the
numbering.

I have yet to collect data under varying loads, but the performance
data I have collected for 8GB shared buffers shows reasonably good
performance and scalability.

I think the main thing left for this patch is more data for various
loads, which I will share in the next few days. However, I think the
patch is ready for the next round of review, so I will mark it as
Needs Review.

Performance Data:
-------------------------------

Configuration and Db Details

IBM POWER-7 16 cores, 64 hardware threads
RAM = 64GB
Database Locale = C
checkpoint_segments = 256
checkpoint_timeout = 15min
shared_buffers = 8GB
scale factor = 3000
Client Count = number of concurrent sessions and threads (e.g. -c 8 -j 8)
Duration of each individual run = 5mins

All the data is in tps and was taken using a pgbench read-only load.

Client Count/Patch_ver       8       16       32       64      128
HEAD                     58614   107370   140717   104357    65010
Patch                    60849   118701   165631   209226   213029

Note -
a. The numbers are slightly different from the previously reported
numbers, as earlier I was using debug builds of the binaries to take
the data, and it seems some kind of trace was enabled on the machine.
However, the improvement in performance and scalability is almost
the same as before.
b. The above data is the median of 3 runs; for detailed data, refer
to the attached document (perf_read_scalability_data_v5.ods).
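
To make the scalability difference easier to read, the tps figures in
the table above can be converted into Patch/HEAD ratios. The snippet
below is only a convenience calculation over the numbers already
given; the function name is made up for illustration.

```python
# Median tps values copied from the table above.
clients = [8, 16, 32, 64, 128]
head = [58614, 107370, 140717, 104357, 65010]
patch = [60849, 118701, 165631, 209226, 213029]


def speedups(base, new):
    """Return new/base for each client count, rounded to 2 decimals."""
    return [round(n / b, 2) for b, n in zip(base, new)]


ratios = speedups(head, patch)

if __name__ == "__main__":
    for c, r in zip(clients, ratios):
        print(f"{c:>4} clients: {r:.2f}x")
```

The ratio stays close to 1 up to 32 clients, then reaches about 2.0x
at 64 clients and about 3.3x at 128 clients, because HEAD peaks at 32
clients and falls off while the patched build keeps increasing.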

CPU Usage
------------------
I have observed that the CPU usage of the new process (reclaimer) is
between 5% and 9%.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
scalable_buffer_eviction_v5.patch application/octet-stream 55.9 KB
perf_read_scalability_data_v5.ods application/vnd.oasis.opendocument.spreadsheet 17.7 KB
