Re: Wait free LW_SHARED acquisition - v0.9

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Wait free LW_SHARED acquisition - v0.9
Date: 2014-10-21 07:10:56
Message-ID: CAA4eK1+3AEdJpKBoAR7-_GqqaO3ZXCEVAL177sVj=nQa7OX5zw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 17, 2014 at 11:41 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
> On 2014-10-17 17:14:16 +0530, Amit Kapila wrote:
> > On Tue, Oct 14, 2014 at 11:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > wrote:
> > HEAD – commit 494affb + wait free lw_shared_v2
> >
> > Shared_buffers=8GB; Scale Factor = 3000
> >
> > Client Count/No. Of Runs (tps)   64       128
> > Run-1                            286209   274922
> > Run-2                            289101   274495
> > Run-3                            289639   273633
>
> So here the results with LW_SHARED were consistently better, right?

Yes.

> You
> saw performance degradations here earlier?

Yes.

> > So I am planning to proceed further with the review/test of your
> > latest patch.
>
> > According to me, below things are left from myside:
> > a. do some basic tpc-b tests with patch

I have done a few tests; the results are below. The data indicates
that there is neither any noticeable gain nor any noticeable loss on
tpc-b tests, which I think is what could have been expected of this
patch. There is slight variation at a few client counts (for
sync_commit=off, at 32 and 128); however, I feel that is just noise,
as I don't see any general trend.

Performance Data
----------------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
Database Locale =C
max_connections =300
checkpoint_segments=300
checkpoint_timeout =15min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
Client Count = number of concurrent sessions and threads (e.g. -c 8 -j 8)
Duration of each individual run = 30mins
Test mode - tpc-b

The data below is the median of 3 runs; detailed data is attached to
this mail.

Scale_factor =3000; shared_buffers=8GB;

Patch/Client_count   8      16     32     64     128
HEAD                 3849   4889   3569   3845   4547
LW_SHARED            3844   4787   3532   3814   4408

Scale_factor =3000; shared_buffers=8GB; synchronous_commit=off;

Patch/Client_count   8      16     32      64     128
HEAD                 5966   8297   10084   9348   8836
LW_SHARED            6070   8612   8839    9503   8584
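
To make the per-client-count differences easier to read, here is a
rough sketch that computes the relative change of LW_SHARED against
HEAD from the median tps values posted above. The names (RESULTS,
pct_change, compare) are made up for illustration, and the script only
restates the posted medians; whether a given difference is noise still
has to be judged against the run-to-run variation in the attached data.

```python
# Median tps values copied from the two tables above
# (scale factor 3000, shared_buffers=8GB). All names are illustrative.
CLIENTS = [8, 16, 32, 64, 128]

RESULTS = {
    "sync_commit=on": {
        "HEAD": [3849, 4889, 3569, 3845, 4547],
        "LW_SHARED": [3844, 4787, 3532, 3814, 4408],
    },
    "sync_commit=off": {
        "HEAD": [5966, 8297, 10084, 9348, 8836],
        "LW_SHARED": [6070, 8612, 8839, 9503, 8584],
    },
}


def pct_change(base, patched):
    """Relative change of patched vs. base, in percent."""
    return (patched - base) / base * 100.0


def compare(results):
    """Return {config: [percent change per client count]}."""
    out = {}
    for config, rows in results.items():
        out[config] = [
            round(pct_change(head, patched), 2)
            for head, patched in zip(rows["HEAD"], rows["LW_SHARED"])
        ]
    return out


for config, deltas in compare(RESULTS).items():
    print(config)
    for clients, delta in zip(CLIENTS, deltas):
        print(f"  {clients:>3} clients: {delta:+.2f}%")
```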

While doing performance tests, I noticed a hang at higher client
counts with the patch. I checked the call stacks of a few of the
processes, and they are as below:

#0 0x0000008010933e54 in .semop () from /lib64/libc.so.6
#1 0x0000000010286e48 in .PGSemaphoreLock ()
#2 0x00000000102f68bc in .LWLockAcquire ()
#3 0x00000000102d1ca0 in .ReadBuffer_common ()
#4 0x00000000102d2ae0 in .ReadBufferExtended ()
#5 0x00000000100a57d8 in ._bt_getbuf ()
#6 0x00000000100a6210 in ._bt_getroot ()
#7 0x00000000100aa910 in ._bt_search ()
#8 0x00000000100ab494 in ._bt_first ()
#9 0x00000000100a8e84 in .btgettuple ()
..

#0 0x0000008010933e54 in .semop () from /lib64/libc.so.6
#1 0x0000000010286e48 in .PGSemaphoreLock ()
#2 0x00000000102f68bc in .LWLockAcquire ()
#3 0x00000000102d1ca0 in .ReadBuffer_common ()
#4 0x00000000102d2ae0 in .ReadBufferExtended ()
#5 0x00000000100a57d8 in ._bt_getbuf ()
#6 0x00000000100a6210 in ._bt_getroot ()
#7 0x00000000100aa910 in ._bt_search ()
#8 0x00000000100ab494 in ._bt_first ()
...

The test configuration is as below:
Test env - Power - 7 (hydra)
scale_factor - 3000
shared_buffers - 8GB
test mode - pgbench read only

test execution -
./pgbench -c 128 -j 128 -T 1800 -S -M prepared postgres

I ran it with a duration of half an hour, but it had not finished even
after ~2 hours. It is not reproducible every time; currently I am
able to reproduce it and the m/c is still in that state, so if you want
any info, let me know (unfortunately the binaries are in release mode,
so we might not get enough information).
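
Since the hang shows up only as a run that never returns, a small
wrapper can flag it without killing anything, so the stuck backends
stay available for inspection. The following is only a sketch: the
helper names (pgbench_cmd, wait_or_flag_hang) and the grace period are
assumptions, and only the pgbench options from the command above are
used.

```python
import subprocess


def pgbench_cmd(clients, duration_s, dbname="postgres"):
    """Build the read-only pgbench command line used in the test above."""
    return [
        "./pgbench",
        "-c", str(clients),
        "-j", str(clients),
        "-T", str(duration_s),
        "-S",
        "-M", "prepared",
        dbname,
    ]


def wait_or_flag_hang(proc, duration_s, grace_s):
    """Return True if proc is still running after duration_s + grace_s.

    The process is deliberately NOT killed on timeout, so that call
    stacks of the stuck backends can still be collected afterwards.
    """
    try:
        proc.wait(timeout=duration_s + grace_s)
        return False
    except subprocess.TimeoutExpired:
        return True


# Example (hypothetical grace period of 10 minutes):
#   proc = subprocess.Popen(pgbench_cmd(128, 1800))
#   if wait_or_flag_hang(proc, 1800, 600):
#       print("pgbench did not finish; leaving it running for inspection")
```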

> > b. re-review latest version posted by you
>
> Cool!

I will post my feedback for code separately, once I am able to
completely review the new versions.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
perf_lwlock_contention_tpcb_data_v1.ods application/vnd.oasis.opendocument.spreadsheet 17.8 KB
