Re: Move unused buffers to freelist

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>
Cc: "'Greg Smith'" <greg(at)2ndquadrant(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Move unused buffers to freelist
Date: 2013-06-06 07:01:21
Message-ID: 005601ce6283$a814b410$f83e1c30$@kapila@huawei.com
Lists: pgsql-hackers

On Tuesday, May 28, 2013 6:54 PM Robert Haas wrote:
> >> Instead, I suggest modifying BgBufferSync, specifically this part right
> >> here:
> >>
> >> else if (buffer_state & BUF_REUSABLE)
> >>     reusable_buffers++;
> >>
> >> What I would suggest is that if the BUF_REUSABLE flag is set here, use
> >> that as the trigger to do StrategyMoveBufferToFreeListEnd().
> >
> > I think at this point also we need to lock buffer header to check refcount
> > and usage_count before moving to freelist, or do you think it is not
> > required?
>
> If BUF_REUSABLE is set, that means we just did exactly what you're
> saying. Why do it twice?

Even though we just did that check, we have since released the buffer header
lock, so in theory a backend could increase the refcount or usage_count in
the meantime. However, the buffer is still protected by the check in
StrategyGetBuffer(). As the chance of this happening is very small, moving
the buffer without taking the buffer header lock should not cause any harm.
A modified patch that addresses this is attached to this mail.
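
The argument above (add a buffer to the freelist on an unlocked hint, and
rely on a recheck when the buffer is taken off the list) can be sketched with
a toy model. This is not the attached patch and not PostgreSQL's code:
ToyBuffer, ToyPool and the toy_* functions are hypothetical names, the model
is single-threaded, and the comments only mark where the real code would hold
the buffer header lock.

```c
#include <assert.h>

#define TOY_NBUFFERS    4
#define TOY_END_OF_LIST (-1)   /* marks the tail of the freelist */
#define TOY_NOT_IN_LIST (-2)   /* buffer is not on the freelist */
#define TOY_NO_BUFFER   (-1)   /* returned when no candidate survives */

/* Toy stand-in for a buffer header; NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuffer {
    int refcount;      /* number of pins */
    int usage_count;   /* clock-sweep popularity counter */
    int free_next;     /* next free buffer, or one of the markers above */
} ToyBuffer;

typedef struct ToyPool {
    ToyBuffer buf[TOY_NBUFFERS];
    int first_free;
    int last_free;
} ToyPool;

static void toy_pool_init(ToyPool *p)
{
    for (int i = 0; i < TOY_NBUFFERS; i++) {
        p->buf[i].refcount = 0;
        p->buf[i].usage_count = 0;
        p->buf[i].free_next = TOY_NOT_IN_LIST;
    }
    p->first_free = TOY_END_OF_LIST;
    p->last_free = TOY_END_OF_LIST;
}

/*
 * Background-writer side: append a buffer that looked reusable a moment ago.
 * The counts are NOT rechecked here, which models moving the buffer without
 * retaking the header lock.
 */
static void toy_move_to_freelist_end(ToyPool *p, int id)
{
    assert(id >= 0 && id < TOY_NBUFFERS);
    if (p->buf[id].free_next != TOY_NOT_IN_LIST)
        return;                         /* already on the list */
    p->buf[id].free_next = TOY_END_OF_LIST;
    if (p->first_free == TOY_END_OF_LIST)
        p->first_free = id;
    else
        p->buf[p->last_free].free_next = id;
    p->last_free = id;
}

/*
 * Backend side: pop candidates and recheck the counts (the real code would
 * do this under the header lock). Stale entries are dropped from the list.
 */
static int toy_get_buffer(ToyPool *p)
{
    while (p->first_free != TOY_END_OF_LIST) {
        int id = p->first_free;
        ToyBuffer *b = &p->buf[id];

        p->first_free = b->free_next;
        if (p->first_free == TOY_END_OF_LIST)
            p->last_free = TOY_END_OF_LIST;
        b->free_next = TOY_NOT_IN_LIST;

        if (b->refcount == 0 && b->usage_count == 0)
            return id;                  /* hint was still valid */
    }
    return TOY_NO_BUFFER;               /* a real system would fall back to the clock sweep */
}
```

In this model, a buffer whose usage_count is raised after it was put on the
list is simply skipped by toy_get_buffer(), so a stale hint costs one wasted
list entry rather than a wrongly evicted page.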

Performance Data
-------------------

As far as I have noticed, the performance data for this patch depends on 3
factors:
1. Pre-loading of data into the buffers, so that buffers holding pages have
some usage count before pgbench is run.
The reason is that this may change how the clock sweep performs.
2. Clearing of pages from the OS cache before running pgbench against a
different build. Otherwise a run with or without the patch can access pages
that are already cached from previous runs, which causes variation in
performance.
3. Scale factor and shared_buffers configuration.

To keep the above 3 factors from affecting the test readings, I used the
steps below:
1. Initialize the database with a scale factor such that database size +
shared_buffers = RAM (shared_buffers = 1/4 of RAM).
For example:
Example 1
If RAM = 128GB, initialize the db with scale factor = 6700
and shared_buffers = 32GB.
Database size (98GB) + shared_buffers (32GB) = 130GB (which
is approximately equal to total RAM)
Example 2 (this is based on your test machine)
If RAM = 64GB, initialize the db with scale factor = 3400
and shared_buffers = 16GB.
2. Reboot the machine.
3. Load all buffers with data (tables/indexes of pgbench) using pg_prewarm.
I loaded them 3 times, so that the usage count of the buffers is
approximately 3.
I used the file load_all_buffers.sql attached to this mail.
4. Run the pgbench select-only case 3 times for 10 or 15 minutes each,
without the patch.
5. Reboot the machine.
6. Load all buffers with data (tables/indexes of pgbench) using pg_prewarm.
I loaded them 3 times, so that the usage count of the buffers is
approximately 3.
I used the file load_all_buffers.sql attached to this mail.
7. Run the pgbench select-only case 3 times for 10 or 15 minutes each,
with the patch.
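
The sizing rule in step 1 can be written out as a small calculation. This is
only a sketch: BenchSizing and plan_sizing are hypothetical names, and the
GB-per-scale-unit ratio is an assumption derived from Example 1 (98GB at
scale factor 6700), so the suggested scale factor is approximate and comes
out somewhat lower than the rounded-up values used in the examples.

```c
#include <assert.h>

/* Result of applying the sizing rule from step 1 (hypothetical type). */
typedef struct BenchSizing {
    double shared_buffers_gb;  /* 1/4 of RAM */
    double database_gb;        /* RAM minus shared_buffers */
    long   scale_factor;       /* approximate pgbench scale factor */
} BenchSizing;

/*
 * ASSUMPTION: GB of pgbench data per unit of scale factor, taken from
 * Example 1 in the mail (98GB at scale factor 6700).
 */
#define GB_PER_SCALE_UNIT (98.0 / 6700.0)

/* Pick shared_buffers and database size so that their sum equals RAM. */
static BenchSizing plan_sizing(double ram_gb)
{
    assert(ram_gb > 0.0);

    BenchSizing s;
    s.shared_buffers_gb = ram_gb / 4.0;
    s.database_gb = ram_gb - s.shared_buffers_gb;
    /* Round to the nearest whole scale factor. */
    s.scale_factor = (long) (s.database_gb / GB_PER_SCALE_UNIT + 0.5);
    return s;
}
```

For a 128GB machine this yields shared_buffers = 32GB and a database target
of 96GB, which is close to the 98GB database used in Example 1.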

Using the above steps, I collected performance data on 2 different machines.

Configuration Details
O/S - Suse-11
RAM - 128GB
Number of Cores - 16
Server Conf - checkpoint_segments = 300, checkpoint_timeout = 15 min,
synchronous_commit = off, shared_buffers = 32GB, autovacuum = off
Pgbench - Select-only
Scalefactor - 1200
Time - Each run is of 15 mins

The data below is the average of 3 runs

               16C-16T   32C-32T   64C-64T
HEAD              4391      3971      3464
After Patch       6147      5093      3944

Detailed data for each run is in the attached file
move_unused_buffers_to_freelist_v2.htm
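
To make the averages above easier to compare, the relative gain of the
patched build over HEAD can be computed as below. This is a sketch only:
gain_percent is a hypothetical helper, and it assumes the figures are pgbench
throughput numbers where higher is better (the mail does not state the unit).

```c
#include <assert.h>

/*
 * Relative improvement of the patched build over HEAD, in percent.
 * ASSUMPTION: both inputs are throughput figures (higher is better).
 */
static double gain_percent(double head, double patched)
{
    assert(head > 0.0);
    return (patched - head) / head * 100.0;
}
```

Applied to the 128GB machine averages, this gives roughly 40%, 28% and 14%
for the 16C-16T, 32C-32T and 64C-64T columns respectively.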

The data below is for 1 half-hour run on the same configuration

               16C-16T   32C-32T   64C-64T
HEAD              4377      3861      3295
After Patch       6542      4770      3504

Configuration Details
O/S - Suse-11
RAM - 24GB
Number of Cores - 8
Server Conf - checkpoint_segments = 256, checkpoint_timeout = 25 min,
synchronous_commit = off, shared_buffers = 5GB
Pgbench - Select-only
Scalefactor - 1200
Time - Each run is of 10 mins

The data below is the average of 3 runs of 10 minutes each

               8C-8T   16C-16T   32C-32T   64C-64T   128C-128T   256C-256T
HEAD           58837     56740     19390      5681        3191        2160
After Patch    59482     56936     25070      7655        4166        2704

Detailed data for each run is in the attached file
move_unused_buffers_to_freelist_v2.htm

The data below is for 1 half-hour run on the same configuration

               32C-32T
HEAD             17703
After Patch      20586

I ran these tests multiple times to make sure the results are correct. I
think the reason your runs did not show a performance improvement last time
is that we were running pgbench differently. This time, I have detailed the
steps I used to collect the performance data.

With Regards,
Amit Kapila.

Attachment                                  Content-Type               Size
move_unused_buffers_to_freelist_v2.patch    application/octet-stream   3.2 KB
move_unused_buffers_to_freelist_v2.htm      text/html                  37.8 KB
load_all_buffers.sql                        application/octet-stream   1.3 KB
