Quick Links

Reducing ClogControlLock contention

Lists:	pgsql-hackers

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Reducing ClogControlLock contention
Date:	2015-06-30 07:02:25
Message-ID:	CANP8+j+imQfHxkChFyfnXDyi6k-arAzRV+ZG-V_OFxEtJjOL2Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

ClogControlLock contention is high at commit time. This appears to be due
to the fact that ClogControlLock is acquired in Exclusive mode prior to
marking commit, which then gets starved by backends running
TransactionIdGetStatus().

Proposal for improving this is to acquire the ClogControlLock in Shared
mode, if possible.

This is safe because people checking visibility of an xid must always run
TransactionIdIsInProgress() first to avoid race conditions, which will
always return true for the transaction we are currently committing. As a
result, we never get concurrent access to the same bits in clog, which
would require a barrier.

Two concurrent writers might access the same word concurrently, so we
protect against that with a new CommitLock. We could partition that by
pageno also, if needed.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment	Content-Type	Size
clog_optimize.v3.patch	application/octet-stream	8.2 KB

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-06-30 07:13:25
Message-ID:	CAB7nPqRY1OBVZtxexn5t7sN=1n_nxsoosQA1TVHXzMXLxEj+sA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> ClogControlLock contention is high at commit time. This appears to be due
> to the fact that ClogControlLock is acquired in Exclusive mode prior to
> marking commit, which then gets starved by backends running
> TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.
>
> This is safe because people checking visibility of an xid must always run
> TransactionIdIsInProgress() first to avoid race conditions, which will
> always return true for the transaction we are currently committing. As a
> result, we never get concurrent access to the same bits in clog, which
> would require a barrier.
>
> Two concurrent writers might access the same word concurrently, so we
> protect against that with a new CommitLock. We could partition that by
> pageno also, if needed.
>

Could it be possible to see some performance numbers? For example with a
simple pgbench script doing a bunch of tiny transactions, with many
concurrent sessions (perhaps hundreds).
--
Michael

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-06-30 07:22:51
Message-ID:	CANP8+jKXq8Xt4_o7nVUrHZu8AkPtvCL-QR_sOVa091Ft7nZ-Tg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 30 June 2015 at 08:13, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

>
>
> On Tue, Jun 30, 2015 at 4:02 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
>
>> ClogControlLock contention is high at commit time. This appears to be due
>> to the fact that ClogControlLock is acquired in Exclusive mode prior to
>> marking commit, which then gets starved by backends running
>> TransactionIdGetStatus().
>>
>> Proposal for improving this is to acquire the ClogControlLock in Shared
>> mode, if possible.
>>
>> This is safe because people checking visibility of an xid must always run
>> TransactionIdIsInProgress() first to avoid race conditions, which will
>> always return true for the transaction we are currently committing. As a
>> result, we never get concurrent access to the same bits in clog, which
>> would require a barrier.
>>
>> Two concurrent writers might access the same word concurrently, so we
>> protect against that with a new CommitLock. We could partition that by
>> pageno also, if needed.
>>
>
> Could it be possible to see some performance numbers? For example with a
> simple pgbench script doing a bunch of tiny transactions, with many
> concurrent sessions (perhaps hundreds).
>

I'm more interested to see if people think it is safe.

This contention is masked by contention elsewhere, e.g. ProcArrayLock, so
the need for testing here should come once other patches ahead of this are
in.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-06-30 07:31:10
Message-ID:	CANP8+j+ATdC18wqUPR0zk__5GgkH4L7naqmOzOO-NRrP_fuUsQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 30 June 2015 at 08:22, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> This contention is masked by contention elsewhere, e.g. ProcArrayLock, so
> the need for testing here should come once other patches ahead of this are
> in.
>

Let me explain more clearly.

Andres' patch to cache snapshots and reduce ProcArrayLock was interesting,
but not initially compelling. We now have a solution that commits in
batches, which will reduce the number of times the ProcArray changes - this
will heighten the benefit from Andres' snapshot cache patch.

So the order of testing/commit should be

Queued commit patch
ProcArray cache patch
Clog shared commit patch (this one)

I didn't hear recent mention of Robert's chash patch, but IIRC it was
effective and so we hope to see it again soon also.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-06-30 08:31:42
Message-ID:	CAA4eK1JjRKsBkjkPWQkNFV+Eej==aSQRYerpWytRyy_s9OxQYA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jun 30, 2015 at 12:52 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 30 June 2015 at 08:13, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
>>
>>
>> Could it be possible to see some performance numbers? For example with a
simple pgbench script doing a bunch of tiny transactions, with many
concurrent sessions (perhaps hundreds).
>
>
> I'm more interested to see if people think it is safe.
>
> This contention is masked by contention elsewhere, e.g. ProcArrayLock, so
the need for testing here should come once other patches ahead of this are
in.
>

Exactly and other lock that can mask this improvement is WALWriteLock,
but for that we can take the performance data with synchronous_commit
off mode.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 08:00:16
Message-ID:	CAA4eK1+6Nbs1eBSm6Uo2jnMHStnw6Etqr7OSu3aOQuZ7MbuWbQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> ClogControlLock contention is high at commit time. This appears to be due
to the fact that ClogControlLock is acquired in Exclusive mode prior to
marking commit, which then gets starved by backends running
TransactionIdGetStatus().
>
> Proposal for improving this is to acquire the ClogControlLock in Shared
mode, if possible.
>

This approach looks good way for avoiding the contention around
ClogControlLock. Few things that occurred to me while looking at
patch are that

a. the semantics of new LWLock (CommitLock) introduced
by patch seems to be different in the sense that it is just taken in
Exclusive mode (and no Shared mode is required) as per your proposal. We
could use existing LWLock APi's, but on the other hand we could even
invent new LWLock API for this kind of locking.

b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly for
read-access of page and the description also says the same, but now
we want to use it for updating page as well. It might be better to invent
similar new API or at the very least modify it's description.

> Two concurrent writers might access the same word concurrently, so we
protect against that with a new CommitLock. We could partition that by
pageno also, if needed.
>

I think it will be better to partition it or use it in some other way to
avoid
two concurrent writers block at it, however if you want to first see the
test results with this, then that is also okay.

Overall the idea seems good to pursue, however I have slight feeling
that using 2 LWLocks (CLOGControlLock in shared mode and new
CommitLock in Exclusive mode) to set the transaction information
is somewhat odd, but I could not see any problem with it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 08:08:11
Message-ID:	CANP8+jJ1TnPPZfkBA3tGOsLGJp61g6TuDKp1=OqKaFDJA+mu_A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
> >
> > ClogControlLock contention is high at commit time. This appears to be
> due to the fact that ClogControlLock is acquired in Exclusive mode prior to
> marking commit, which then gets starved by backends running
> TransactionIdGetStatus().
> >
> > Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.
> >
>
> This approach looks good way for avoiding the contention around
> ClogControlLock. Few things that occurred to me while looking at
> patch are that
>
> a. the semantics of new LWLock (CommitLock) introduced
> by patch seems to be different in the sense that it is just taken in
> Exclusive mode (and no Shared mode is required) as per your proposal. We
> could use existing LWLock APi's, but on the other hand we could even
> invent new LWLock API for this kind of locking.
>

LWLock API code is already too complex, so -1 for more changes there

> b. SimpleLruReadPage_ReadOnly() - This API's is meant mainly for
> read-access of page and the description also says the same, but now
> we want to use it for updating page as well. It might be better to invent
> similar new API or at the very least modify it's description.
>

Agreed, perhaps SimpleLruReadPage_Optimistic()

> > Two concurrent writers might access the same word concurrently, so we
> protect against that with a new CommitLock. We could partition that by
> pageno also, if needed.
> >
>
> I think it will be better to partition it or use it in some other way to
> avoid
> two concurrent writers block at it, however if you want to first see the
> test results with this, then that is also okay.
>

Many updates would be on same page, so partitioning it would need to be at
least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.

> Overall the idea seems good to pursue, however I have slight feeling
> that using 2 LWLocks (CLOGControlLock in shared mode and new
> CommitLock in Exclusive mode) to set the transaction information
> is somewhat odd, but I could not see any problem with it.
>

Perhaps call it the CommitSerializationLock would help. There are already
locks that are held only in Exclusive mode.

Locking two separate resources at same time is common in other code.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 10:11:50
Message-ID:	CAA4eK1K7nqnFh_fCE0bMKC9yTehfDWY9EvYmpDSkv_x=kYjCUA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>
>> I think it will be better to partition it or use it in some other way to
avoid
>> two concurrent writers block at it, however if you want to first see the
>> test results with this, then that is also okay.
>
>
> Many updates would be on same page, so partitioning it would need to be
at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
>

Sure, it makes sense to try that way, once you have that ready, I can
try this out along with ProcArrayLock patch to see the impact.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 10:14:02
Message-ID:	20150701101402.GY30708@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> > a. the semantics of new LWLock (CommitLock) introduced
> > by patch seems to be different in the sense that it is just taken in
> > Exclusive mode (and no Shared mode is required) as per your proposal. We
> > could use existing LWLock APi's, but on the other hand we could even
> > invent new LWLock API for this kind of locking.
> >
>
> LWLock API code is already too complex, so -1 for more changes there

I don't think that's a valid argument. It's better to have the
complexity in one place (lwlock) than have rather similar complexity in
several other places. The clog control lock is far from the only place
that would benefit from tricks along these lines.

Greetings,

Andres Freund

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 10:15:40
Message-ID:	CANP8+jLFNZ2m+k0MXa3pt19H5UBDkaQpNPqUhRp9Sd35p7McgA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 1 July 2015 at 11:11, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Jul 1, 2015 at 1:38 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >>
> >>
> >> I think it will be better to partition it or use it in some other way
> to avoid
> >> two concurrent writers block at it, however if you want to first see the
> >> test results with this, then that is also okay.
> >
> >
> > Many updates would be on same page, so partitioning it would need to be
> at least 4-way to be worth doing. Maybe we could stripe into 512 bye pages.
> >
>
> Sure, it makes sense to try that way, once you have that ready, I can
> try this out along with ProcArrayLock patch to see the impact.
>

Seems sensible to measure what the new point of contention is with both
before doing anything further.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 10:19:40
Message-ID:	CANP8+jJNXe=nooPPkPKfZMdTsD8Cq3CFc0YxQjtPzf+SU0VuTg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 1 July 2015 at 11:14, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> > On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> > > a. the semantics of new LWLock (CommitLock) introduced
> > > by patch seems to be different in the sense that it is just taken in
> > > Exclusive mode (and no Shared mode is required) as per your proposal.
> We
> > > could use existing LWLock APi's, but on the other hand we could even
> > > invent new LWLock API for this kind of locking.
> > >
> >
> > LWLock API code is already too complex, so -1 for more changes there
>
> I don't think that's a valid argument. It's better to have the
> complexity in one place (lwlock) than have rather similar complexity in
> several other places. The clog control lock is far from the only place
> that would benefit from tricks along these lines.
>

What "tricks" are being used??

Please explain why taking 2 locks is bad here, yet works fine elsewhere.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-01 10:21:11
Message-ID:	20150701102111.GZ30708@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 2015-07-01 11:19:40 +0100, Simon Riggs wrote:
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.

I didn't say anything about 'bad'. It's more complicated than one
lock. Suddenly you have to care about lock ordering and such. The
algorithms for ensuring correctness gets more complicated.

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-07-02 18:17:16
Message-ID:	CA+TgmoYjpNKdHDFUtJLAMna-O5LGuTDnanHFAOT5=hN_VAuW2Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jul 1, 2015 at 6:21 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2015-07-01 11:19:40 +0100, Simon Riggs wrote:
>> What "tricks" are being used??
>>
>> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>
> I didn't say anything about 'bad'. It's more complicated than one
> lock. Suddenly you have to care about lock ordering and such. The
> algorithms for ensuring correctness gets more complicated.

Taking two locks might also be more expensive than just taking one. I
suppose benchmarking will reveal whether there is an issue there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 09:55:19
Message-ID:	CAA4eK1J-w5DzM15H+Y4OUcfy8yekEDNm-+5PEHqO1OK7+o-bqQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 1 July 2015 at 11:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
>> > On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> >
>> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
>> > > a. the semantics of new LWLock (CommitLock) introduced
>> > > by patch seems to be different in the sense that it is just taken in
>> > > Exclusive mode (and no Shared mode is required) as per your
proposal. We
>> > > could use existing LWLock APi's, but on the other hand we could even
>> > > invent new LWLock API for this kind of locking.
>> > >
>> >
>> > LWLock API code is already too complex, so -1 for more changes there
>>
>> I don't think that's a valid argument. It's better to have the
>> complexity in one place (lwlock) than have rather similar complexity in
>> several other places. The clog control lock is far from the only place
>> that would benefit from tricks along these lines.
>
>
> What "tricks" are being used??
>
> Please explain why taking 2 locks is bad here, yet works fine elsewhere.
>

One thing that could be risky in this new scheme of locking
is that now in functions TransactionIdSetPageStatus and
TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
in Shared mode whereas I think it is mandated in the code that those
should be modified with ControlLock in Exlusive mode. This could have
some repercussions.

Another thing is that in this flow, with patch there will be three locks
(we take per-buffer locks before doing I/O) that will get involved rather
than
two, so one effect of this patch will be that currently while doing I/O,
concurrent committers will be allowed to proceed as we release ControlLock
before doing the same whereas with Patch, they will not be allowed as they
are blocked by CommitLock. Now may be this scenario is less common and
doesn't matter much if the patch improves the more common scenario,
however this is an indication of what Andres tries to highlight that having
more
locks for this might make patch more complicated.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 10:14:34
Message-ID:	CANP8+jJdLKiKFraU5Zk124r2MOagLSUQBLspgJ2cbF=A-3SwZg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 11 August 2015 at 10:55, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Jul 1, 2015 at 3:49 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > On 1 July 2015 at 11:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >>
> >> On 2015-07-01 09:08:11 +0100, Simon Riggs wrote:
> >> > On 1 July 2015 at 09:00, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >> >
> >> > > On Tue, Jun 30, 2015 at 12:32 PM, Simon Riggs <
> simon(at)2ndquadrant(dot)com>
> >> > > a. the semantics of new LWLock (CommitLock) introduced
> >> > > by patch seems to be different in the sense that it is just taken in
> >> > > Exclusive mode (and no Shared mode is required) as per your
> proposal. We
> >> > > could use existing LWLock APi's, but on the other hand we could even
> >> > > invent new LWLock API for this kind of locking.
> >> > >
> >> >
> >> > LWLock API code is already too complex, so -1 for more changes there
> >>
> >> I don't think that's a valid argument. It's better to have the
> >> complexity in one place (lwlock) than have rather similar complexity in
> >> several other places. The clog control lock is far from the only place
> >> that would benefit from tricks along these lines.
> >
> >
> > What "tricks" are being used??
> >
> > Please explain why taking 2 locks is bad here, yet works fine elsewhere.
> >
>
> One thing that could be risky in this new scheme of locking
> is that now in functions TransactionIdSetPageStatus and
> TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
> in Shared mode whereas I think it is mandated in the code that those
> should be modified with ControlLock in Exlusive mode. This could have
> some repercussions.
>

Do you know of any? This is a technical forum, so if we see a problem we
say what it is, and if we don't, that's usually classed as a positive point
in a code review.

> Another thing is that in this flow, with patch there will be three locks
> (we take per-buffer locks before doing I/O) that will get involved rather
> than
> two, so one effect of this patch will be that currently while doing I/O,
> concurrent committers will be allowed to proceed as we release ControlLock
> before doing the same whereas with Patch, they will not be allowed as they
> are blocked by CommitLock. Now may be this scenario is less common and
> doesn't matter much if the patch improves the more common scenario,
> however this is an indication of what Andres tries to highlight that
> having more
> locks for this might make patch more complicated.
>

It's easy to stripe the CommitLock in that case, if it is a problem.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 10:39:51
Message-ID:	CAA4eK1L8ks5QP8SckddKoehKZLVmApiOgUTW6hDjK+H2VxRynA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On 11 August 2015 at 10:55, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>> > What "tricks" are being used??
>> >
>> > Please explain why taking 2 locks is bad here, yet works fine
>> elsewhere.
>> >
>>
>> One thing that could be risky in this new scheme of locking
>> is that now in functions TransactionIdSetPageStatus and
>> TransactionIdSetStatusBit, we modify slru's shared state with Control Lock
>> in Shared mode whereas I think it is mandated in the code that those
>> should be modified with ControlLock in Exlusive mode. This could have
>> some repercussions.
>>
>
> Do you know of any? This is a technical forum, so if we see a problem we
> say what it is, and if we don't, that's usually classed as a positive point
> in a code review.
>

One of the main reason of saying this is that it is written in File
level comments in slru.c that for accessing (examine or modify)
the shared state, Control lock *must* be held in Exclusive mode
except in function SimpleLruReadPage_ReadOnly(). So, whereas
I agree that I should think more about if there is any breakage due
to patch, but I don't find any explanation either in your e-mail or in
patch why it is safe to modify the state after patch when it was not
before. If you think it is safe, then atleast modify comments in
slru.c.

> Another thing is that in this flow, with patch there will be three locks
>> (we take per-buffer locks before doing I/O) that will get involved rather
>> than
>> two, so one effect of this patch will be that currently while doing I/O,
>> concurrent committers will be allowed to proceed as we release ControlLock
>> before doing the same whereas with Patch, they will not be allowed as they
>> are blocked by CommitLock. Now may be this scenario is less common and
>> doesn't matter much if the patch improves the more common scenario,
>> however this is an indication of what Andres tries to highlight that
>> having more
>> locks for this might make patch more complicated.
>>
>
> It's easy to stripe the CommitLock in that case, if it is a problem.
>

Sure, I think other places in code that take both the other locks also
needs to be checked for updation.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 13:53:31
Message-ID:	CAA4eK1LTLBYO3Vrh6m3tNeni8Zez2c3E=Ete0dGTma3b6zvAow@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 11, 2015 at 4:09 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
wrote:
>>
>>>
>>> Another thing is that in this flow, with patch there will be three locks
>>> (we take per-buffer locks before doing I/O) that will get involved
rather than
>>> two, so one effect of this patch will be that currently while doing I/O,
>>> concurrent committers will be allowed to proceed as we release
ControlLock
>>> before doing the same whereas with Patch, they will not be allowed as
they
>>> are blocked by CommitLock. Now may be this scenario is less common and
>>> doesn't matter much if the patch improves the more common scenario,
>>> however this is an indication of what Andres tries to highlight that
having more
>>> locks for this might make patch more complicated.
>>
>>
>> It's easy to stripe the CommitLock in that case, if it is a problem.
>
>
> Sure, I think other places in code that take both the other locks also
> needs to be checked for updation.
>

One more point here why do we need CommitLock before calling
SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
instead of CommitLock?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 13:57:14
Message-ID:	CANP8+jJ=zgXn6YSXh4TXo-A0V-7p4dLRMiPnAUS8_CrEipvP8g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 11 August 2015 at 14:53, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> One more point here why do we need CommitLock before calling
> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
> then can we use LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE);
> instead of CommitLock?
>

That prevents read only access, not just commits, so that isn't a better
suggestion.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-11 14:02:15
Message-ID:	CANP8+jJcJKH09Wa7_mp7xikAryv=rgjO0RArZ_3VoCKwWK_nuA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 11 August 2015 at 11:39, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Tue, Aug 11, 2015 at 3:44 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
>
>> On 11 August 2015 at 10:55, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>>> > What "tricks" are being used??
>>> >
>>> > Please explain why taking 2 locks is bad here, yet works fine
>>> elsewhere.
>>> >
>>>
>>> One thing that could be risky in this new scheme of locking
>>> is that now in functions TransactionIdSetPageStatus and
>>> TransactionIdSetStatusBit, we modify slru's shared state with Control
>>> Lock
>>> in Shared mode whereas I think it is mandated in the code that those
>>> should be modified with ControlLock in Exlusive mode. This could have
>>> some repercussions.
>>>
>>
>> Do you know of any? This is a technical forum, so if we see a problem we
>> say what it is, and if we don't, that's usually classed as a positive point
>> in a code review.
>>
>
> One of the main reason of saying this is that it is written in File
> level comments in slru.c that for accessing (examine or modify)
> the shared state, Control lock *must* be held in Exclusive mode
> except in function SimpleLruReadPage_ReadOnly(). So, whereas
> I agree that I should think more about if there is any breakage due
> to patch, but I don't find any explanation either in your e-mail or in
> patch why it is safe to modify the state after patch when it was not
> before. If you think it is safe, then atleast modify comments in
> slru.c.
>

"except"...

I explained that a reader will never be reading data that is concurrently
changed by a writer, so it was safe to break the general rule for this
specific case only.

Yes, I will modify comments in the patch.

> Another thing is that in this flow, with patch there will be three locks
>>> (we take per-buffer locks before doing I/O) that will get
>>> involved rather than
>>> two, so one effect of this patch will be that currently while doing I/O,
>>> concurrent committers will be allowed to proceed as we
>>> release ControlLock
>>> before doing the same whereas with Patch, they will not be allowed as
>>> they
>>> are blocked by CommitLock. Now may be this scenario is less common and
>>> doesn't matter much if the patch improves the more common scenario,
>>> however this is an indication of what Andres tries to highlight that
>>> having more
>>> locks for this might make patch more complicated.
>>>
>>
>> It's easy to stripe the CommitLock in that case, if it is a problem.
>>
>
> Sure, I think other places in code that take both the other locks also
> needs to be checked for updation.
>

Not sure what that means, but there are no other places calling CommitLock

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-12 03:49:54
Message-ID:	CAA4eK1LMPDUECo4_DnWBpDwXpjGLvjohJ_CG-r_NjD1wvxemJw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 11 August 2015 at 14:53, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>>
>> One more point here why do we need CommitLock before calling
>> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
>> then can we use LWLockAcquire(shared->buffer_locks[slotno],
LW_EXCLUSIVE);
>> instead of CommitLock?
>
>
> That prevents read only access, not just commits, so that isn't a better
suggestion.

read only access of what (clog page?)?

Here we are mainly doing three operations read clog page, write transaction
status
on clog page and update shared control state. So basically two resources
are
involved clog page and shared control state, so which one of those you are
talking?

Apart from above, in below code, it is assumed that we have exclusive lock
on
clog page which we don't in the proposed patch as some one can read the
same page while we are modifying it. In current code, this assumption is
valid
because during Write we take CLogControlLock in Exclusive mode and while
Reading we take the same in Shared mode.

TransactionIdSetStatusBit()
{
..
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
..
}

Now even if this is a problem, I think we can solve it with some more lower
level lock or may be with atomic operation, but I have mentioned it to check
your opinion on the same.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-12 04:14:15
Message-ID:	CAA4eK1+xQvrrU2nJD+tPSdpD-2J92iCrzJ44uKUZPTaPNyQuKA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 11, 2015 at 7:32 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 11 August 2015 at 11:39, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>>>>
>>>> Another thing is that in this flow, with patch there will be three
locks
>>>> (we take per-buffer locks before doing I/O) that will get involved
rather than
>>>> two, so one effect of this patch will be that currently while doing
I/O,
>>>> concurrent committers will be allowed to proceed as we release
ControlLock
>>>> before doing the same whereas with Patch, they will not be allowed as
they
>>>> are blocked by CommitLock. Now may be this scenario is less common and
>>>> doesn't matter much if the patch improves the more common scenario,
>>>> however this is an indication of what Andres tries to highlight that
having more
>>>> locks for this might make patch more complicated.
>>>
>>>
>>> It's easy to stripe the CommitLock in that case, if it is a problem.
>>
>>
>> Sure, I think other places in code that take both the other locks also
>> needs to be checked for updation.
>
>
> Not sure what that means, but there are no other places calling
CommitLock
>

What I mean is that if want to ensure that we don't keep CommitLock while
doing I/O, then we might also want to ensure the same while waiting for I/O.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-22 14:14:48
Message-ID:	20150822141448.GH8552@awork2.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

Amit pinged me about my opinion of this patch. I don't really have
time/energy for an in-depth review right now, but since I'd looked
enough to see some troublesome points, I thought I'd write those.

On 2015-06-30 08:02:25 +0100, Simon Riggs wrote:
> Proposal for improving this is to acquire the ClogControlLock in Shared
> mode, if possible.

I find that rather scary. That requires for all read and write paths in
clog.c and slru.c to only ever read memory locations once. Otherwise
those reads may not get the same results. That'd mean we'd need to add
indirections via volatile to all reads/writes. It also requires that we
never read in 4 byte quantities.

> This is safe because people checking visibility of an xid must always run
> TransactionIdIsInProgress() first to avoid race conditions, which will
> always return true for the transaction we are currently committing. As a
> result, we never get concurrent access to the same bits in clog, which
> would require a barrier.

I don't think that's really sufficient. There's a bunch of callers do
lookups without such checks, e.g. in heapam.c. It might be safe due to
other circumstances, but at the very least this is a significant but
sublte API revision.

> Two concurrent writers might access the same word concurrently, so we
> protect against that with a new CommitLock. We could partition that by
> pageno also, if needed.

To me it seems better to make this more integrated with slru.c. Change
the locking so that the control lock (relevant for page mappings et al)
is different from the locks for reading/writing data.

> * If we're doing an async commit (ie, lsn is valid), then we must wait
> * for any active write on the page slot to complete. Otherwise our
> * update could reach disk in that write, which will not do since we
> @@ -273,7 +290,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
> * write-busy, since we don't care if the update reaches disk sooner than
> * we think.
> */
> - slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
> + if (!InRecovery)
> + LWLockAcquire(CommitLock, LW_EXCLUSIVE);
> + slotno = SimpleLruReadPage_ReadOnly(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
>
> /*
> * Set the main transaction id, if any.
> @@ -312,6 +331,9 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
> ClogCtl->shared->page_dirty[slotno] = true;
>
> LWLockRelease(CLogControlLock);
> +
> + if (!InRecovery)
> + LWLockRelease(CommitLock);
> }

TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
writes an 8 byte variable (the lsn). That's not safe.

Maybe write_ok = XLogRecPtrIsInvalid(lsn) is trying to address that? If
so, I don't see how. A page is returned with only the shared lock held
if it's in VALID state before. Even if that were changed, this'd be a
mightily subtle thing to do without a very fat comment.

> @@ -447,7 +447,8 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok,
> * It is unspecified whether the lock will be shared or exclusive.
> */
> int
> -SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
> +SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, bool write_ok,
> + TransactionId xid)
> {
> SlruShared shared = ctl->shared;
> int slotno;
> @@ -460,7 +461,9 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
> {
> if (shared->page_number[slotno] == pageno &&
> shared->page_status[slotno] != SLRU_PAGE_EMPTY &&
> - shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS)
> + shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS &&
> + (write_ok ||
> + shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))
> {
> /* See comments for SlruRecentlyUsed macro */
> SlruRecentlyUsed(shared, slotno);
> @@ -472,7 +475,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
> LWLockRelease(shared->ControlLock);
> LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
>
> - return SimpleLruReadPage(ctl, pageno, true, xid);
> + return SimpleLruReadPage(ctl, pageno, write_ok, xid);
> }

This function's name would definitely need to be changed, and it'll need
to be documented when/how it's safe to use write_ok = true. Same with
slru.c's header.

A patch like this will need far more changes. Every read and write from
a page has to be guaranteed to only be done once, otherwise you can get
'effectively torn' values.

That is, you can't just do
static void
TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
...
char *byteptr;
char byteval;
...
/* note this assumes exclusive access to the clog page */
byteval = *byteptr;
byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
byteval |= (status << bshift);
*byteptr = byteval;
...

the compiler is entirely free to "optimize" away the byteval variable
and do all these masking operations on memory! It can intermittenly
write temporary values to byteval, because e.g. the register pressure is
too high.

To actually rely on single-copy-atomicity you have to enforce that these
reads/writes can only happen. Leavout out some possible macro
indirection that'd have to look like
byteval = (volatile char *) byteptr;
...
*(volatile char *) byteptr = byteval;
some for TransactionIdGetStatus(), without the write side obviously.

and stuff like
if (!XLogRecPtrIsInvalid(lsn))
{
int lsnindex = GetLSNIndex(slotno, xid);

if (ClogCtl->shared->group_lsn[lsnindex] < lsn)
ClogCtl->shared->group_lsn[lsnindex] = lsn;
}
just aren't possible at all unless we drop a bunch of platforms.

In essence I think this patch allows us to roughly benchmark whether and
how benefitial fixing the contention around clog is, but it seems pretty
far from something that can actually be applied.

Greetings,

Andres Freund

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-25 08:38:22
Message-ID:	CANP8+jJSwKmyewz15oXcO2839ZJq9SvpRooN=dXhSedrhtQEFQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 12 August 2015 at 04:49, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Tue, Aug 11, 2015 at 7:27 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
> >
> > On 11 August 2015 at 14:53, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >>
> >> One more point here why do we need CommitLock before calling
> >> SimpleLruReadPage_ReadOnly() in the patch and if it is not required,
> >> then can we use LWLockAcquire(shared->buffer_locks[slotno],
> LW_EXCLUSIVE);
> >> instead of CommitLock?
> >
> >
> > That prevents read only access, not just commits, so that isn't a better
> suggestion.
>
> read only access of what (clog page?)?
>
> Here we are mainly doing three operations read clog page, write
> transaction status
> on clog page and update shared control state. So basically two resources
> are
> involved clog page and shared control state, so which one of those you are
> talking?
>

Sorry, your suggestion was good. Using
LWLockAcquire(shared->buffer_locks[slotno], LW_EXCLUSIVE); now seems
sufficient.

Apart from above, in below code, it is assumed that we have exclusive lock
> on
> clog page which we don't in the proposed patch as some one can read the
> same page while we are modifying it. In current code, this assumption is
> valid
> because during Write we take CLogControlLock in Exclusive mode and while
> Reading we take the same in Shared mode.
>

Not exactly, no. This is not a general case, it is for one important and
very specific case only, exactly suited to our transaction manager. I have
checked all call paths and we are good.

New patch attached. I will reply to Andres' post separately since this does
not yet address all of his detailed points.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment	Content-Type	Size
clog_optimize.v4.patch	application/octet-stream	4.2 KB

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-25 08:51:10
Message-ID:	CANP8+jLJ2nX+O7AbxF-vYHYuxuX91DXRO2_P9cyrPVFV=tBrrw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2015-06-30 08:02:25 +0100, Simon Riggs wrote:
> > Proposal for improving this is to acquire the ClogControlLock in Shared
> > mode, if possible.
>
> I find that rather scary. That requires for all read and write paths in
> clog.c and slru.c to only ever read memory locations once. Otherwise
> those reads may not get the same results. That'd mean we'd need to add
> indirections via volatile to all reads/writes. It also requires that we
> never read in 4 byte quantities.

There is is a very specific case in which this is possible. We already
allow writes to data structures in the transaction manager without locks
*in certain cases* and this is all that is proposed here. Nothing scary
about doing something we already do, as long as we get it right.

> > This is safe because people checking visibility of an xid must always run
> > TransactionIdIsInProgress() first to avoid race conditions, which will
> > always return true for the transaction we are currently committing. As a
> > result, we never get concurrent access to the same bits in clog, which
> > would require a barrier.
>
> I don't think that's really sufficient. There's a bunch of callers do
> lookups without such checks, e.g. in heapam.c. It might be safe due to
> other circumstances, but at the very least this is a significant but
> sublte API revision.

I've checked the call sites you mention and they refer to tests made AFTER
we know have waited for the xid to complete via the lock manager. So as of
now, there are no callers of TransactionIdGetStatus() that have not already
confirmed that the xid is no longer active in the lock manager or the
procarray. Since we set clog before touching lock manager or procarray we
can be certain there is no concurrent reads and writes.

> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
> writes an 8 byte variable (the lsn). That's not safe.
>

Agreed, thanks for spotting that.

I see this as the biggest problem to overcome with this patch.

We write WAL in pages, so we don't need to store the low order bits (varies
according to WAL page size). We seldom use the higher order bits, since it
takes a while to go thru (8192 * INT_MAX) = 32PB of WAL. So it seems like
we can have a base LSN for a whole clog page, plus 4-byte LSN offsets. That
way we can make the reads and writes atomic on all platforms. All of that
can be hidden in clog.c in macros, so low impact, modular code.

> A patch like this will need far more changes. Every read and write from
> a page has to be guaranteed to only be done once, otherwise you can get
> 'effectively torn' values.
>
> That is, you can't just do
> static void
> TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr
> lsn, int slotno)
> ...
> char *byteptr;
> char byteval;
> ...
> /* note this assumes exclusive access to the clog page */
> byteval = *byteptr;
> byteval &= ~(((1 << CLOG_BITS_PER_XACT) - 1) << bshift);
> byteval |= (status << bshift);
> *byteptr = byteval;
> ...
>
> the compiler is entirely free to "optimize" away the byteval variable
> and do all these masking operations on memory! It can intermittenly
> write temporary values to byteval, because e.g. the register pressure is
> too high.
>
> To actually rely on single-copy-atomicity you have to enforce that these
> reads/writes can only happen. Leavout out some possible macro
> indirection that'd have to look like
> byteval = (volatile char *) byteptr;
> ...
> *(volatile char *) byteptr = byteval;
> some for TransactionIdGetStatus(), without the write side obviously.
>

Seems doable.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-26 10:40:37
Message-ID:	CAA4eK1LJkAhn9uqejMXBNc+Dp-q-+ZEXnc0x=bzh-s3407A_Nw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
>

> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>> writes an 8 byte variable (the lsn). That's not safe.
>>
>
> Agreed, thanks for spotting that.
>
> I see this as the biggest problem to overcome with this patch.
>

How about using 64bit atomics or spinlock to protect this variable?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-26 10:48:13
Message-ID:	CANP8+jLxC1yhMtwPCGfbyqjuhoasYvuT8C09hqK=3tGsN_Hf+Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 26 August 2015 at 11:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
>
>> On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>
>
>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>
>> Agreed, thanks for spotting that.
>>
>> I see this as the biggest problem to overcome with this patch.
>>
>
> How about using 64bit atomics or spinlock to protect this variable?
>

Spinlock is out IMHO because this happens on every clog lookup. So it must
be an atomic read.

I'm wondering if its worth making this work on 32-bit systems at all. The
contention problems only occur on higher end servers, so we can just
disable this optimization if we aren't on a 64bit server.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-08-31 11:34:59
Message-ID:	CAA4eK1L8a6CA_Pjb-Ke1+JzE9yvx_aaowMKt5j6Kn9YumwkpYQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Aug 26, 2015 at 4:18 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 26 August 2015 at 11:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it
must be an atomic read.
>

Agreed, however using atomics is still an option, yet another way could
be before updating group_lsn, check if we already have CLogControlLock
in Exclusive mode then update it, else release the lock, re-acquire in
Exclusive mode and update the variable. The first thing that comes to mind
with this idea is that it will be less performant, yes thats true, but it
will be
only done for asynchronous commits (mode which is generally not recommended
for production-use) and that too not on every commit, so may be the impact
is
not that high. I have updated the patch (attached with mail) to show
you what
I have in mind.

Another point about the latest patch:

+ (write_ok ||
+ shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))

Do you think that with new locking protocol as proposed in this
patch, it is safe to return if page status is SLRU_PAGE_WRITE_IN_PROGRESS
even if write_ok is true?

I think the case where it can cause problem is during
SlruInternalWritePage()
where it performs below actions:
1. first marks the status of page as SLRU_PAGE_WRITE_IN_PROGRESS.
2. then take buffer lock in Exclusive mode.
3. release control lock.
4. perform the write
5. re-acquire the control lock in Exclusive mode.

Now consider another client which has to update the transaction status:
1. Control lock in Shared mode.
2. Get the slot
3. Acquire the buffer lock in Exclusive mode

Now consider client which has to update the transaction status performs
its step-1 after step-3 of writer, if that happens, then that could lead to
deadlock because writer will wait for client to release control lock and
client will wait for writer to release buffer lock.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
clog_optimize.v5.patch	application/octet-stream	6.3 KB

From:	Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-09-18 18:01:51
Message-ID:	55FC518F.4070002@redhat.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08/31/2015 07:34 AM, Amit Kapila wrote:
> I have updated the patch (attached with mail) to show
> you what I have in mind.
>

I havn't been able to get a successful run with _v5 using pgbench.

TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock
when called, but that part is removed from TransactionIdSetPageStatus now.

I tried an

if (!LWLockHeldByMe(CLogControlLock))
{
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
mode = LW_EXCLUSIVE;
}

approach, but didn't get further. Plus that probably isn't the best way,
since we will traverse all LWLock's, and we need to account for cases w/
and w/o sub-transactions, e.g. multiple calls to
TransactionIdSetStatusBit within the CLogControlLock exclusive boundary.

Best regards,
Jesper

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-09-19 03:32:23
Message-ID:	CAA4eK1+fD8v9YvqZnMkEvWggdRceRVdDyERN0GQ7Y==928pA-w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Sep 18, 2015 at 11:31 PM, Jesper Pedersen <
jesper(dot)pedersen(at)redhat(dot)com> wrote:

> On 08/31/2015 07:34 AM, Amit Kapila wrote:
>
>> I have updated the patch (attached with mail) to show
>> you what I have in mind.
>>
>>
> I havn't been able to get a successful run with _v5 using pgbench.
>
>
This patch is still not ready for performance test, as you might
have seen in comments that we are still discussing locking
issues.

> TransactionIdSetStatusBit assumes an exclusive lock on CLogControlLock
> when called, but that part is removed from TransactionIdSetPageStatus now.
>

It doesn't seem to be a necessary condition, thats why in this patch
a smaller granularity lock is introduced.

> I tried an
>
> if (!LWLockHeldByMe(CLogControlLock))
> {
> LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
> mode = LW_EXCLUSIVE;
> }
>
> approach, but didn't get further.

I suspect the problem is something else.

> Plus that probably isn't the best way, since we will traverse all LWLock's,

Yes, but it does in such a way that probably the caller will find it in
very few initial entries in the array.

Thank you very much for doing valuable performance tests with the
CLog patches.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Reducing ClogControlLock contention
Date:	2015-12-16 22:28:15
Message-ID:	CANP8+jKUfWNz3FQS_2o19bV7vESO+v_WmBqkVrRoHPr1pTs5Kw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:

> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
> writes an 8 byte variable (the lsn). That's not safe.
>

This point was the main sticking point here.

It turns out that we don't need to store the LSN (8 bytes).

WAL is fsynced every time we finish writing a file, so we only actually
need to store the byte position within each file, so no more than 16MB.
That fits within a 4 byte value, so can be written atomically.

So I have a valid way forward for this approach. Cool.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services