Re: FlexLocks

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: FlexLocks
Date: 2011-11-15 13:50:25
Message-ID: CA+Tgmob4YE_k5dpO0T07PNf1SOKPybo+wj4m4FryOS7Z4_yOzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've been noodling around with various methods of reducing
ProcArrayLock contention and (after many false starts) I think I've
finally found one that works really well. I apologize in advance if
this makes your head explode; I think that the design I have hear is
solid, but it represents a significant and invasive overhaul of the
LWLock system - I think for the better, but you'll have to be the
judge. I'll start with the performance numbers (from that good ol'
32-core Nate Boley system), where I build from commit
f1585362856d4da17113ba2e4ba46cf83cba0cf2, with and without the
attached patches, and then ran pgbench on logged and unlogged tables
with various numbers of clients, with shared_buffers = 8GB,
maintenance_work_mem = 1GB, synchronous_commit = off,
checkpoint_segments = 300, checkpoint_timeout = 15min,
checkpoint_completion_target = 0.9, wal_writer_delay = 20ms. The
numbers below are (as usual) the median of three-five minute runs at
scale factor 100. The lines starting with "m" and a number are that
number of clients on unpatched master, and the lines starting with "f"
are that number of clients with this patch set.

The really big win here is unlogged tables at 32 clients, where
throughput has *doubled* and now scales *better than linearly* as
compared with the single-client results.

== Unlogged Tables ==
m01 tps = 679.737639 (including connections establishing)
f01 tps = 668.275270 (including connections establishing)
m08 tps = 4771.757193 (including connections establishing)
f08 tps = 4867.520049 (including connections establishing)
m32 tps = 10736.232426 (including connections establishing)
f32 tps = 21303.295441 (including connections establishing)
m80 tps = 7829.989887 (including connections establishing)
f80 tps = 19835.231438 (including connections establishing)

== Permanent Tables ==
m01 tps = 634.424125 (including connections establishing)
f01 tps = 633.450405 (including connections establishing)
m08 tps = 4544.781551 (including connections establishing)
f08 tps = 4556.298219 (including connections establishing)
m32 tps = 9902.844302 (including connections establishing)
f32 tps = 11028.745881 (including connections establishing)
m80 tps = 7467.437442 (including connections establishing)
f80 tps = 11909.738232 (including connections establishing)

A couple of other interesting things buried in these numbers:

1. Permanent tables don't derive nearly as much benefit as unlogged
tables. I believe that this is because, for permanent tables, the
major bottleneck is WALInsertLock. Fixing ProcArrayLock squeezes out
a healthy 10%, but we'll have to make significant performance on
WALInsertLock to get anywhere close to linear scaling.
2. The drop-off between 32 clients and 80 clients is greatly reduced
with this patch set; indeed, for permanent tables, tps increased
slightly between 32 and 80 clients. I believe that small decrease for
unlogged tables is likely due to the fact that by 80 tables,
WALInsertLock starts to become a contention point, due to the need to
insert the commit records.

In terms of the actual patches, it's been bugging me for a while that
the LWLock code contains a lot of infrastructure that's not easily
reusable by other parts of the system. So the first of the two
attached patches, flexlock-v1.patch, separates the LWLock code into an
upper layer and a lower layer. The lower layer I called "FlexLocks",
and it's designed to allow a variety of locking implementations to be
built on top of it and reuse as much of the basic infrastructure as I
could figure out how to make reusable without hurting performance too
much. LWLocks become the anchor client of the FlexLock system; in
essence, most of flexlock.c is code that was removed from lwlock.c.
The second patch, procarraylock.c, uses that infrastructure to define
a new type of FlexLock specifically for ProcArrayLock. It basically
works like a regular LWLock, except that it has a special operation to
optimize ProcArrayEndTransaction(). In the uncontended case, instead
of acquiring and releasing the lock, it just grabs the lock, observes
that there is no contention, clears the critical PGPROC fields (which
isn't noticeably slower than updating the state of the lock would be)
and releases the spin lock. There's then no need to reacquire the
spinlock to "release" the lock; we're done. In the contended case,
the backend wishing to end adds itself to a queue of ending
transactions. When ProcArrayLock is released, the last person out
clears the PGPROC structures for all the waiters and wakes them all
up; they don't need to reacquire the lock, because the work they
wished to perform while holding it is already done. Thus, in the
*worst* case, ending transactions only need to acquire the spinlock
protecting ProcArrayLock half as often (once instead of twice), and in
the best case (where backends have to keep retrying only to repeatedly
fail to get the lock) it's far better than that.

Of course, there are ways that this could be implemented without the
FlexLock stuff, if people don't like this solution. Myself, I find it
quite elegant (though there are certainly arguable points in there
where the code could probably be improved), but then again, I wrote
it. For what it's worth, I believe that there are other places where
the FlexLock infrastructure could be helpful. In this case, the new
ProcArrayLock is very specific to what ProcArrayLock actually does,
and can't be really reused for anything else. But I've had a thought
that we might want to have a type of FlexLock that contains an LSN.
The lock holder advances the LSN and can then release everyone who was
waiting for a value <= that LSN without them needing to reacquire the
lock. This could be useful for things like WALWriteLock, and sync
rep. Also, I think there might be interesting applications for buffer
locks, perhaps by having a lock type that manages both content locks
and pins. Alternatively, if we want to support CRCs, it might be
useful to have a third buffer lock mode in between shared and
exclusive. SX would conflict with itself and with exclusive but not
with shared, and would be required to write out the page or set hint
bits but not just to examine tuples; this could be used to ensure that
the page doesn't change (thus invalidating the CRC) while the write is
in progress. I'm not necessarily saying that any of these particular
things are what we want to do, just throwing out the idea that we may
want a variety of lock types that are similar to lightweight locks but
with subtly different behavior, yet with common infrastructure for
error handling and wait queue management.

Anyway, this is all up for discussion, argument, etc. - but here are
the patches. Comments, idea, thoughts, code review, and/or testing
are appreciated.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
flexlock-v1.patch application/octet-stream 73.7 KB
procarraylock-v1.patch application/octet-stream 33.5 KB

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 16:55:49
Message-ID: 4EC245350200002500042F62@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I'm not necessarily saying that any of these particular
> things are what we want to do, just throwing out the idea that we
> may want a variety of lock types that are similar to lightweight
> locks but with subtly different behavior, yet with common
> infrastructure for error handling and wait queue management.

The locking in the clog area is pretty funky. I bet we could craft
a special flavor of FlexLock to make that cleaner. And I would be
surprised if some creative thinking didn't yield a far better FL
scheme for SSI than we can manage with existing LW locks.

Your description makes sense to me, and your numbers prove the value
of the concept. Whether there's anything in the implementation I
would quibble about will take some review time.

-Kevin


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 18:40:19
Message-ID: CA+U5nMJ7wn1HB9QqNvitH_UnawZDE0OoEaqvGYsaDNcAEU7Tgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 1:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> It basically
> works like a regular LWLock, except that it has a special operation to
> optimize ProcArrayEndTransaction().  In the uncontended case, instead
> of acquiring and releasing the lock, it just grabs the lock, observes
> that there is no contention, clears the critical PGPROC fields (which
> isn't noticeably slower than updating the state of the lock would be)
> and releases the spin lock.  There's then no need to reacquire the
> spinlock to "release" the lock; we're done.  In the contended case,
> the backend wishing to end adds itself to a queue of ending
> transactions.  When ProcArrayLock is released, the last person out
> clears the PGPROC structures for all the waiters and wakes them all
> up; they don't need to reacquire the lock, because the work they
> wished to perform while holding it is already done.  Thus, in the
> *worst* case, ending transactions only need to acquire the spinlock
> protecting ProcArrayLock half as often (once instead of twice), and in
> the best case (where backends have to keep retrying only to repeatedly
> fail to get the lock) it's far better than that.

Which is the same locking avoidance technique we already use for sync
rep and for the new group commit patch.

I've been saying for some time that we should use the same technique
for ProcArray and clog also, so we only need to queue once rather than
queue three times at end of each transaction.

I'm not really enthused by the idea of completely rewriting lwlocks
for this. Seems like specialised code is likely to be best, as well as
having less collateral damage.

With that in mind, should we try to fuse the group commit with the
procarraylock approach, so we just queue once and get woken when all
the activities have been handled? If the first woken proc performs the
actions then wakes people further down the queue it could work quite
well.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 20:16:31
Message-ID: CA+TgmobxsXY92y8wc9QjSFQrEim5R-SD++s1MXKXqR8aPufQ6Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 1:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Which is the same locking avoidance technique we already use for sync
> rep and for the new group commit patch.

Yep...

> I've been saying for some time that we should use the same technique
> for ProcArray and clog also, so we only need to queue once rather than
> queue three times at end of each transaction.
>
> I'm not really enthused by the idea of completely rewriting lwlocks
> for this. Seems like specialised code is likely to be best, as well as
> having less collateral damage.

Well, the problem that I have with that is that we're going to end up
with a lot of specialized code, particularly around error recovery.
This code doesn't remove the need for ProcArrayLock to be taken in
exclusive mode, and I don't think there IS any easy way to remove the
need for that to happen sometimes. So we have to deal with the
possibility that an ERROR might occur while we hold the lock, which
means we have to release the lock and clean up our state. That means
every place that has a call to LWLockReleaseAll() will now also need
to cleanup ProperlyCleanUpProcArrayLockStuff(). And the next time we
need to add some kind of specialized lock, we'll need to do the same
thing again. It seems to me that that rapidly gets unmanageable, not
to mention *slow*. We need some kind of common infrastructure for
releasing locks, and this is an attempt to create such a thing. I'm
not opposed to doing it some other way, but I think doing each one as
a one-off isn't going to work out very well.

Also, in this particular case, I really do want shared and exclusive
locks on ProcArrayLock to stick around; I just want one additional
operation as well. It's a lot less duplicated code to do that this
way than it is to write something from scratch. The FlexLock patch
may look big, but it's mostly renaming and rearranging; it's really
not adding much code.

> With that in mind, should we try to fuse the group commit with the
> procarraylock approach, so we just queue once and get woken when all
> the activities have been handled? If the first woken proc performs the
> actions then wakes people further down the queue it could work quite
> well.

Well, there's too much work there to use the same approach I took
here: we can't very well hold onto the LWLock spinlock while flushing
WAL or waiting for synchronous replication. Fusing together some
parts of the commit sequence might be the right approach (I don't
know), but honestly my gut feeling is that the first thing we need to
do is go in the opposite direction and break up WALInsertLock into
multiple locks that allow better parallelization of WAL insertion. Of
course if someone finds a way to fuse the whole commit sequence
together in some way that improves performance, fantastic, but having
tried a lot of things before I came up with this approach, I'm a bit
reluctant to abandon it in favor of an approach that hasn't been coded
or tested yet. I think we should pursue this approach for now, and we
can always revise it later if someone comes up with something even
better. As a practical matter, the test results show that with these
patches, ProcArrayLock is NOT a bottleneck at 32 cores, which seems
like enough reason to be pretty happy with it, modulo implementation
details.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 20:33:07
Message-ID: 1321388865-sup-4640@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Robert Haas's message of mar nov 15 17:16:31 -0300 2011:
> On Tue, Nov 15, 2011 at 1:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Which is the same locking avoidance technique we already use for sync
> > rep and for the new group commit patch.
>
> Yep...
>
> > I've been saying for some time that we should use the same technique
> > for ProcArray and clog also, so we only need to queue once rather than
> > queue three times at end of each transaction.
> >
> > I'm not really enthused by the idea of completely rewriting lwlocks
> > for this. Seems like specialised code is likely to be best, as well as
> > having less collateral damage.
>
> Well, the problem that I have with that is that we're going to end up
> with a lot of specialized code, particularly around error recovery.
> This code doesn't remove the need for ProcArrayLock to be taken in
> exclusive mode, and I don't think there IS any easy way to remove the
> need for that to happen sometimes. So we have to deal with the
> possibility that an ERROR might occur while we hold the lock, which
> means we have to release the lock and clean up our state. That means
> every place that has a call to LWLockReleaseAll() will now also need
> to cleanup ProperlyCleanUpProcArrayLockStuff(). And the next time we
> need to add some kind of specialized lock, we'll need to do the same
> thing again. It seems to me that that rapidly gets unmanageable, not
> to mention *slow*. We need some kind of common infrastructure for
> releasing locks, and this is an attempt to create such a thing. I'm
> not opposed to doing it some other way, but I think doing each one as
> a one-off isn't going to work out very well.

I agree. In fact, I would think that we should look into rewriting the
sync rep locking (and group commit) on top of flexlocks, not the other
way around. As Kevin says nearby it's likely that we could find some
way to rewrite the SLRU (clog and such) locking protocol using these new
things too.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 20:47:54
Message-ID: 4EC27B9A0200002500042F84@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> As Kevin says nearby it's likely that we could find some way to
> rewrite the SLRU (clog and such) locking protocol using these new
> things too.

Yeah, I really meant all SLRU, not just clog. And having seen what
Robert has done here, I'm kinda glad I haven't gotten around to
trying to reduce LW lock contention yet, even though we're getting
dangerously far into the release cycle -- I think it can be done
much better with the, er, flexibility offered by the FlexLock patch.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-15 22:54:52
Message-ID: CA+TgmoYjBS3xotBuZ+GTxfTpnNLT7c2Gyx6=vS9xN11BxEdvZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 3:47 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
>
>> As Kevin says nearby it's likely that we could find some way to
>> rewrite the SLRU (clog and such) locking protocol using these new
>> things too.
>
> Yeah, I really meant all SLRU, not just clog.  And having seen what
> Robert has done here, I'm kinda glad I haven't gotten around to
> trying to reduce LW lock contention yet, even though we're getting
> dangerously far into the release cycle -- I think it can be done
> much better with the, er, flexibility offered by the FlexLock patch.

I've had a thought that the SLRU machinery could benefit from having
the concept of a "pin", which it currently doesn't. I'm not certain
whether that thought is correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-16 05:37:38
Message-ID: 20111116053737.GC69215@csail.mit.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 10:55:49AM -0600, Kevin Grittner wrote:
> And I would be
> surprised if some creative thinking didn't yield a far better FL
> scheme for SSI than we can manage with existing LW locks.

One place I could see it being useful is for
SerializableFinishedListLock, which protects the queue of committed
sxacts that can't yet be cleaned up. When committing a transaction, it
gets added to the list, and then scans the queue to find and clean up
any sxacts that are no longer needed. If there's contention, we don't
need multiple backends doing that scan; it's enough for one to complete
it on everybody's behalf.

I haven't thought it through, but it may also help with the other
contention bottleneck on that lock: that every transaction needs to add
itself to the cleanup list when it commits.

Mostly unrelatedly, the other thing that's looking like it would be really
useful would be some support for atomic integer operations. This would
be useful for some SSI things like writableSxactCount, and some things
elsewhere like the strong lock count in the regular lock manager.
I've been toying with the idea of creating an AtomicInteger type with a
few operations like increment-and-get, compare-and-set, swap, etc. This
would be implemented using the appropriate hardware operations on
platforms that support them (x86_64, perhaps others) and fall back on a
spinlock implementation on other platforms. I'll probably give it a try
and see what it looks like, but if anyone has any thoughts, let me know.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 08:03:57
Message-ID: CA+U5nMJrrnH-Bz9Dfkugh7icRsM0cw7-XWsUjzZdEjPwTXKTng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 8:33 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:

>> > I'm not really enthused by the idea of completely rewriting lwlocks
>> > for this. Seems like specialised code is likely to be best, as well as
>> > having less collateral damage.
>>
>> Well, the problem that I have with that is that we're going to end up
>> with a lot of specialized code, particularly around error recovery.
>> This code doesn't remove the need for ProcArrayLock to be taken in
>> exclusive mode, and I don't think there IS any easy way to remove the
>> need for that to happen sometimes.  So we have to deal with the
>> possibility that an ERROR might occur while we hold the lock, which
>> means we have to release the lock and clean up our state.  That means
>> every place that has a call to LWLockReleaseAll() will now also need
>> to cleanup ProperlyCleanUpProcArrayLockStuff().  And the next time we
>> need to add some kind of specialized lock, we'll need to do the same
>> thing again.  It seems to me that that rapidly gets unmanageable, not
>> to mention *slow*.  We need some kind of common infrastructure for
>> releasing locks, and this is an attempt to create such a thing.  I'm
>> not opposed to doing it some other way, but I think doing each one as
>> a one-off isn't going to work out very well.
>
> I agree.  In fact, I would think that we should look into rewriting the
> sync rep locking (and group commit) on top of flexlocks, not the other
> way around.  As Kevin says nearby it's likely that we could find some
> way to rewrite the SLRU (clog and such) locking protocol using these new
> things too.

I see the commonality between ProcArray locking and Sync Rep/ Group
Commit locking. It's basically the same design, so although it wasn't
my first thought, I agree.

I did originally write that using spinlocks, but that was objected to.
Presumably the same objection would hold here also, but if it doesn't
that's good.

Mixing the above 3 things together is enough for me; I just don't see
the reason to do a global search and replace on the lwlock name in
order to do that. This is 2 patches at same time, 1 we clearly need, 1
I'm not sure about. Perhaps some more explanation about the flexlocks
structure and design will smooth that unease.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 15:26:13
Message-ID: 4EC381B50200002500043031@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:

> I just don't see the reason to do a global search and replace on
> the lwlock name

I was going to review further before commenting on that, but since
it has now come up -- it seems odd that a source file which uses
only LW locks needs to change so much for the FlexLock
implementation. I'm not sure source code which uses the next layer
up from FlexLock should need to be aware of it quite so much. I
assume that this was done to avoid adding a layer to some code where
it could cause an unnecessary performance hit, but maybe it would be
worth just wrapping with a macro to isolate the levels where that's
all it takes.

For example, if these two macros were defined, predicate.c wouldn't
have needed any modifications, and I suspect that is true of many
other files (although possibly needing a few other macros):

#define LWLockId FlexLockId
#define LWLockHeldByMe(lock) FlexLockHeldByMe(lock)

Particularly with the function call it seems like it's a mistake to
assume that test will always be the same between LW locks and flex
locks. There may be a better way to do it than the above, but I
think a worthy goal would be to impose zero source code changes on
code which continues to use "traditional" lightweight locks.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Simon Riggs" <simon(at)2ndQuadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 15:41:14
Message-ID: 23485.1321458074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>> I just don't see the reason to do a global search and replace on
>> the lwlock name

> I was going to review further before commenting on that, but since
> it has now come up -- it seems odd that a source file which uses
> only LW locks needs to change so much for the FlexLock
> implementation.

Yeah, -1 on wideranging source changes for me too. There is no reason
that the current LWLock API need change. (I'm not saying that it has
to be same ABI though --- macro wrappers would be fine.)

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 15:41:14
Message-ID: CA+TgmoYDv2_-mtAtyYyMnBdC+PV8D_dqa3pQoi6wOH7wYscrag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 10:26 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> For example, if these two macros were defined, predicate.c wouldn't
> have needed any modifications, and I suspect that is true of many
> other files (although possibly needing a few other macros):
>
> #define LWLockId FlexLockId
> #define LWLockHeldByMe(lock) FlexLockHeldByMe(lock)
>
> Particularly with the function call it seems like it's a mistake to
> assume that test will always be the same between LW locks and flex
> locks.  There may be a better way to do it than the above, but I
> think a worthy goal would be to impose zero source code changes on
> code which continues to use "traditional" lightweight locks.

Well, it would certainly be easy enough to add those macros, and I'm
not necessarily opposed to it, but I fear it could end up being a bit
confusing in the long run. If we adopt this infrastructure, then I
expect knowledge of different types of FlexLocks to gradually
propagate through the system. Now, you're always going to use
LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
but a FlexLockId isn't guaranteed to be an LWLockId - any given value
might also refer to a FlexLock of some other type. If we let everyone
continue to refer to those things as LWLockIds, then it seems like
only a matter of time before someone has a variable that's declared as
LWLockId but actually doesn't refer to an LWLock at all. I think it's
better to bite the bullet and do the renaming up front, rather than
having to think about it every time you modify some code that uses
LWLockId or LWLockHeldByMe and say to yourself, "oh, wait a minute, is
this really guaranteed to be an LWLock?"

For LWLockHeldByMe, a sensible compromise might be to add a function
that asserts that the FlexLockId passed as an argument is in fact
pointing to an LWLock, and then calls FlexLockHeldByMe() and returns
the result. That way you'd presumably noticed if you used the more
specific function when you needed the more general one (because,
hopefully, the regression tests would fail). But I'm not seeing any
obvious way of providing a similar degree of insulation against
abusing LWLockId.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 15:51:33
Message-ID: 23744.1321458693@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Well, it would certainly be easy enough to add those macros, and I'm
> not necessarily opposed to it, but I fear it could end up being a bit
> confusing in the long run. If we adopt this infrastructure, then I
> expect knowledge of different types of FlexLocks to gradually
> propagate through the system. Now, you're always going to use
> LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
> but a FlexLockId isn't guaranteed to be an LWLockId - any given value
> might also refer to a FlexLock of some other type. If we let everyone
> continue to refer to those things as LWLockIds, then it seems like
> only a matter of time before someone has a variable that's declared as
> LWLockId but actually doesn't refer to an LWLock at all. I think it's
> better to bite the bullet and do the renaming up front, rather than
> having to think about it every time you modify some code that uses
> LWLockId or LWLockHeldByMe and say to yourself, "oh, wait a minute, is
> this really guaranteed to be an LWLock?"

In that case, I think you've chosen an unfortunate naming convention
and should rethink it. There is not any benefit to be gained from a
global search and replace here, and as somebody who spends quite enough
time dealing with cross-branch coding differences already, I'm going to
put my foot down about introducing a useless one.

Perhaps it would be better to think of this as "they're all lightweight
locks, but some have different locking policies". Or "we're taking a
different type of lock on this particular lock" --- that would match up
rather better with the way we think about heavyweight locks.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 16:09:19
Message-ID: CA+TgmobKnDqFH8KJSSuRCdhyganHPW0jbo-xTBqeMfpFpx4xgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 10:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Well, it would certainly be easy enough to add those macros, and I'm
>> not necessarily opposed to it, but I fear it could end up being a bit
>> confusing in the long run.  If we adopt this infrastructure, then I
>> expect knowledge of different types of FlexLocks to gradually
>> propagate through the system.  Now, you're always going to use
>> LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
>> but a FlexLockId isn't guaranteed to be an LWLockId - any given value
>> might also refer to a FlexLock of some other type.  If we let everyone
>> continue to refer to those things as LWLockIds, then it seems like
>> only a matter of time before someone has a variable that's declared as
>> LWLockId but actually doesn't refer to an LWLock at all.  I think it's
>> better to bite the bullet and do the renaming up front, rather than
>> having to think about it every time you modify some code that uses
>> LWLockId or LWLockHeldByMe and say to yourself, "oh, wait a minute, is
>> this really guaranteed to be an LWLock?"
>
> In that case, I think you've chosen an unfortunate naming convention
> and should rethink it.  There is not any benefit to be gained from a
> global search and replace here, and as somebody who spends quite enough
> time dealing with cross-branch coding differences already, I'm going to
> put my foot down about introducing a useless one.
>
> Perhaps it would be better to think of this as "they're all lightweight
> locks, but some have different locking policies".  Or "we're taking a
> different type of lock on this particular lock" --- that would match up
> rather better with the way we think about heavyweight locks.

I struggled a lot with how to name this, and I'm not going to pretend
that what I came up with is necessarily ideal. But the basic idea
here is that all FlexLocks share the following properties in common:

- they are identified by a FlexLockId
- they are released by FlexLockReleaseAll
- they make use of the lwlock-related fields (renamed in the patch) in
PGPROC for sleep and wakeup handling
- they have a type indicator, a mutex, a retry flag, and a wait queue

But the following things are different per-type:

- acquire, conditional acquire (if any), and release functions
- available lock modes
- additional data fields that are part of the lock

Now, it seemed to me that if I was going to split the LWLock facililty
into two layers, either the upper layer could be LWLocks, or the lower
layer could be LWLocks, but they couldn't both be LWLocks. Since we
use LWLockAcquire() and LWLockRelease() all over the place but only
make reference to LWLockId in comparatively few places, it seemed to
me to be by far the less invasive renaming to make the upper layer be
LWLocks and the lower layer be something else.

Now maybe there is some better way to do this, but at the moment, I'm
not seeing it. If we call them all LWLocks, but only some of them
support LWLockAcquire(), then that's going to be pretty weird. The
situation is not really analagous to heavy-weight locks, where every
lock supports every lock mode, but in practice only table locks make
use of them all. In this particular case, we do not want to clutter
up the vanilla LWLock implementation with a series of special cases
that are only useful for a minority of locks in the system. That will
cause them to stop being lightweight, which misses the point; and it
will be ugly as hell, because the exact frammishes needed will
doubtless vary from one lock to another, and having just one lock type
that supports every single one of those frammishes is certainly a bad
idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 16:14:40
Message-ID: CAM-w4HNxn29Jg-JnF976ALx4vnSCbDRgucOye_2TpBe_bPnbhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 3:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>  Now, you're always going to use
> LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
> but a FlexLockId isn't guaranteed to be an LWLockId - any given value
> might also refer to a FlexLock of some other type.  If we let everyone
> continue to refer to those things as LWLockIds, then it seems like
> only a matter of time before someone has a variable that's declared as
> LWLockId but actually doesn't refer to an LWLock at all.  I think it's
> better to bite the bullet and do the renaming up front, rather than
> having to think about it every time you modify some code that uses
> LWLockId or LWLockHeldByMe and say to yourself, "oh, wait a minute, is
> this really guaranteed to be an LWLock?"

But that's an advantage to having a distinct API layer for LWLock
instead of having callers directly call FlexLock methods. The LWLock
macros can AssertMacro that the lockid they were passed are actually
LWLocks and not some other type of lock. That would require assigning
FlexLockIds that are recognizably LWLocks but that's not implausible
is it?

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 16:17:42
Message-ID: 4EC38DC60200002500043042@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Now maybe there is some better way to do this, but at the moment,
> I'm not seeing it. If we call them all LWLocks, but only some of
> them support LWLockAcquire(), then that's going to be pretty
> weird.

Is there any way to typedef our way out of it, such that a LWLock
*is a* FlexLock, but a FlexLock isn't a LWLock? If we could do
that, you couldn't use just a plain old FlexLock in LWLockAcquire(),
but you could do the cleanups, etc., that you want.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 17:00:21
Message-ID: CA+Tgmob2MbLWQ49ZUEvCc11HLN38jhG_Y+D8RfX34qTYWeUXDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 11:14 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Wed, Nov 16, 2011 at 3:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>  Now, you're always going to use
>> LWLockAcquire() and LWLockRelease() to acquire and release an LWLock,
>> but a FlexLockId isn't guaranteed to be an LWLockId - any given value
>> might also refer to a FlexLock of some other type.  If we let everyone
>> continue to refer to those things as LWLockIds, then it seems like
>> only a matter of time before someone has a variable that's declared as
>> LWLockId but actually doesn't refer to an LWLock at all.  I think it's
>> better to bite the bullet and do the renaming up front, rather than
>> having to think about it every time you modify some code that uses
>> LWLockId or LWLockHeldByMe and say to yourself, "oh, wait a minute, is
>> this really guaranteed to be an LWLock?"
>
> But that's an advantage to having a distinct API layer for LWLock
> instead of having callers directly call FlexLock methods. The LWLock
> macros can AssertMacro that the lockid they were passed are actually
> LWLocks and not some other type of lock. That would require assigning
> FlexLockIds that are recognizably LWLocks but that's not implausible
> is it?

Well, that works for the most part. You still need a few generic
functions, like FlexLockReleaseAll(), which releases all FlexLocks of
all types, not just those of some particular type. And it doesn't
solve the problem with FlexLockId, which can potentially refer to a
FlexLock of any type, not just a LWLock.

I think we might be getting slightly more excited about this problem
than it actually deserves. Excluding lwlock.{c,h}, the new files
added by this patch, and the documentation changes, this patch adds
103 lines and removes 101. We can uncontroversially reduce each
numbers by 14 by adding a separate LWLockHeldByMe() function that does
the same thing as FlexLockHeldByMe() but also asserts the lock type.
That would leave us adding 89 lines of code and removing 87.

If we (against my better judgement) take the position that we must
continue to use LWLockId rather than FlexLockId as the type name in
any place that only treats with LWLocks we could reduce each of those
numbers by an additional 34, giving new totals of 55 and 53 lines of
changes outside the lwlock/flexlock code itself rather than 89 and 87.
I humbly submit that this is not really enough to get excited about.
We've make far more sweeping notational changes than that more than
once - even, I think, with some regularity.

This may seem invasive because it's touching LWLocks, and we use those
everywhere, but in practice the code footprint is very small because
typical usage is just LWLockAcquire(BlahLock) and then
LWLockRelease(BlahLock). And I'm not proposing to change that usage
in any way; avoiding any change in that area was, in fact, one of my
main design goals.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FlexLocks
Date: 2011-11-16 17:04:14
Message-ID: CA+TgmoZW0mY4HpzYS=Jw7d=YLuwi71jCHEVmyQS7Wzjr=_hvXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 11:17 AM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> Now maybe there is some better way to do this, but at the moment,
>> I'm not seeing it.  If we call them all LWLocks, but only some of
>> them support LWLockAcquire(), then that's going to be pretty
>> weird.
>
> Is there any way to typedef our way out of it, such that a LWLock
> *is a* FlexLock, but a FlexLock isn't a LWLock?  If we could do
> that, you couldn't use just a plain old FlexLock in LWLockAcquire(),
> but you could do the cleanups, etc., that you want.

Well, if we just say:

typedef FlexLockId LWLockId;

...that's about equivalent to the #define from the compiler's point of
view. We could alternatively change one or the other of them to be a
struct with one member, but I think the cure might be worse than the
disease. By my count, we are talking about saving perhaps as many as
34 lines of code changes here, and that's only if complicating the
type handling doesn't require any changes to places that are untouched
at present, which I suspect it would.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: FlexLocks
Date: 2011-11-16 17:25:15
Message-ID: 4EC39D9B020000250004305B@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

>> Is there any way to typedef our way out of it [?]

> Well, if we just say:
>
> typedef FlexLockId LWLockId;
>
> ...that's about equivalent to the #define from the compiler's
> point of view.

Bummer -- I was hoping there was some equivalent to "subclassing"
that I just didn't know about. :-(

> We could alternatively change one or the other of them to be a
> struct with one member, but I think the cure might be worse than
> the disease. By my count, we are talking about saving perhaps as
> many as 34 lines of code changes here, and that's only if
> complicating the type handling doesn't require any changes to
> places that are untouched at present, which I suspect it would.

So I stepped through all the changes of this type, and I notice that
most of them are in areas where we've talked about likely benefits
of creating new FlexLock variants instead of staying with LWLocks;
if any of that is done (as seems likely), it further reduces the
impact from 34 lines. If we take care of LWLockHeldByMe() as you
describe, I'll concede the FlexLockId changes.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: FlexLocks
Date: 2011-11-16 20:17:06
Message-ID: CA+Tgmoax_14rbx8Y6mmgvW64gCQL4ZviDzwEObXEMuiV=TwmxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 12:25 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> We could alternatively change one or the other of them to be a
>> struct with one member, but I think the cure might be worse than
>> the disease.  By my count, we are talking about saving perhaps as
>> many as 34 lines of code changes here, and that's only if
>> complicating the type handling doesn't require any changes to
>> places that are untouched at present, which I suspect it would.
>
> So I stepped through all the changes of this type, and I notice that
> most of them are in areas where we've talked about likely benefits
> of creating new FlexLock variants instead of staying with LWLocks;
> if any of that is done (as seems likely), it further reduces the
> impact from 34 lines.  If we take care of LWLockHeldByMe() as you
> describe, I'll concede the FlexLockId changes.

Updated patches attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
flexlock-v2.patch application/octet-stream 70.3 KB
procarraylock-v1.patch application/octet-stream 33.5 KB

From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-17 04:16:18
Message-ID: CABOikdPd-YmicyV8kZw25xksQZ4oJy7tS89yxHbadoh=Awe8tQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 15, 2011 at 7:20 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The lower layer I called "FlexLocks",
> and it's designed to allow a variety of locking implementations to be
> built on top of it and reuse as much of the basic infrastructure as I
> could figure out how to make reusable without hurting performance too
> much.  LWLocks become the anchor client of the FlexLock system; in
> essence, most of flexlock.c is code that was removed from lwlock.c.
> The second patch, procarraylock.c, uses that infrastructure to define
> a new type of FlexLock specifically for ProcArrayLock.  It basically
> works like a regular LWLock, except that it has a special operation to
> optimize ProcArrayEndTransaction().  In the uncontended case, instead
> of acquiring and releasing the lock, it just grabs the lock, observes
> that there is no contention, clears the critical PGPROC fields (which
> isn't noticeably slower than updating the state of the lock would be)
> and releases the spin lock.

(Robert, we already discussed this a bit privately, so apologies for
duplicating this here)

Another idea is to have some sort of shared work queue mechanism which
might turn out to be more manageable and extendable. What I am
thinking about is having a {Request, Response} kind of structure per
backend in shared memory. An obvious place to hold them is in PGPROC
for every backend. We the have a new API like LWLockExecute(lock,
mode, ReqRes). The caller first initializes the ReqRes structure with
the work it needs get done and then calls LWLockExecute with that.
IOW, the code flow would look like this:

<Initialize the Req/Res structure with request type and input data>
LWLockExecute(lock, mode, ReqRes)
<Consume Response and proceed further>

If the lock is available in the desired mode, LWLockExecute() will
internally finish the work and return immediately. If the lock is
contended, the process would sleep. When current holder of the lock
finishes its work and calls LWLockRelease() to release the lock, it
would not only find the processes to wake up, but would also go
through their pending work items and complete them before waking them
up. The Response area will be populated with the result.

I think this general mechanism will be useful for many users of
LWLock, especially those who do very trivial updates/reads from the
shared area, but still need synchronization. One example that Robert
has already found helping a lot if ProcArrayEndTransaction. Also, even
though both shared and exclusive waiters can use this mechanism, it
may make more sense to the exclusive waiters because of the
exclusivity. For sake of simplicity, we can choose to force a
semantics that when LWLockExecute returns, the work is guaranteed to
be done, either by self or some other backend. That will keep the code
simpler for users of this new API.

Thanks,
Pavan
--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-17 04:31:08
Message-ID: CA+TgmobKMc=Y8O4N=rZdWvF7PMJ0aTzSO7iNLiwsPobWVi8iFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 16, 2011 at 11:16 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> On Tue, Nov 15, 2011 at 7:20 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> The lower layer I called "FlexLocks",
>> and it's designed to allow a variety of locking implementations to be
>> built on top of it and reuse as much of the basic infrastructure as I
>> could figure out how to make reusable without hurting performance too
>> much.  LWLocks become the anchor client of the FlexLock system; in
>> essence, most of flexlock.c is code that was removed from lwlock.c.
>> The second patch, procarraylock.c, uses that infrastructure to define
>> a new type of FlexLock specifically for ProcArrayLock.  It basically
>> works like a regular LWLock, except that it has a special operation to
>> optimize ProcArrayEndTransaction().  In the uncontended case, instead
>> of acquiring and releasing the lock, it just grabs the lock, observes
>> that there is no contention, clears the critical PGPROC fields (which
>> isn't noticeably slower than updating the state of the lock would be)
>> and releases the spin lock.
>
> (Robert, we already discussed this a bit privately, so apologies for
> duplicating this here)
>
> Another idea is to have some sort of shared work queue mechanism which
> might turn out to be more manageable and extendable. What I am
> thinking about is having a {Request, Response} kind of structure per
> backend in shared memory. An obvious place to hold them is in PGPROC
> for every backend. We the have a new API like LWLockExecute(lock,
> mode, ReqRes). The caller first initializes the ReqRes structure with
> the work it needs get done and then calls LWLockExecute with that.
> IOW, the code flow would look like this:
>
> <Initialize the Req/Res structure with request type and input data>
> LWLockExecute(lock, mode, ReqRes)
> <Consume Response and proceed further>
>
> If the lock is available in the desired mode, LWLockExecute() will
> internally finish the work and return immediately. If the lock is
> contended, the process would sleep. When current holder of the lock
> finishes its work and calls LWLockRelease() to release the lock, it
> would not only find the processes to wake up, but would also go
> through their pending work items and complete them before waking them
> up. The Response area will be populated with the result.
>
> I think this general mechanism will be useful for many users of
> LWLock, especially those who do very trivial updates/reads from the
> shared area, but still need synchronization. One example that Robert
> has already found helping a lot if ProcArrayEndTransaction. Also, even
> though both shared and exclusive waiters can use this mechanism, it
> may make more sense to the exclusive waiters because of the
> exclusivity. For sake of simplicity, we can choose to force a
> semantics that when LWLockExecute returns, the work is guaranteed to
> be done, either by self or some other backend. That will keep the code
> simpler for users of this new API.

I am not convinced that that's a better API. I mean, consider
something like this:

/*
* OK, let's do it. First let other backends know I'm in ANALYZE.
*/
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
MyProc->vacuumFlags |= PROC_IN_ANALYZE;
LWLockRelease(ProcArrayLock);

I'm not sure exactly how you'd proposed to rewrite that, but I think
it's almost guaranteed to be more than three lines of code. Also, you
can't assume that the "work" can be done equally well by any backend.
In this case it could, because the PGPROC structures are all in shared
memory, but that won't work for something like GetSnapshotData(),
which needs to copy a nontrivial amount of data into backend-local
memory.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-17 04:49:58
Message-ID: CABOikdPPityXq9-OBeQqAgYXf4bKDkTDPcFoKS8ypwRMcGwTKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 17, 2011 at 10:01 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>
> I am not convinced that that's a better API.  I mean, consider
> something like this:
>
>    /*
>     * OK, let's do it.  First let other backends know I'm in ANALYZE.
>     */
>    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>    MyProc->vacuumFlags |= PROC_IN_ANALYZE;
>    LWLockRelease(ProcArrayLock);

> I'm not sure exactly how you'd proposed to rewrite that, but I think
> it's almost guaranteed to be more than three lines of code.

I would guess the ReqRes will look something like this where
ReqResRequest/Response would probably be union of all various requests
and responses, one for each type of request:

struct ReqRes {
ReqResRequestType reqtype;
ReqResRequest req;
ReqResResponse res;
}

The code above can be rewritten as:

reqRes.reqtype = RR_PROC_SET_VACUUMFLAGS;
reqRes.req.set_vacuumflags.flags = PROC_IN_ANALYZE;
LWLockExecute(ProcArrayLock, LW_EXCLUSIVE, &reqRes);

I mean, I agree it doesn't look exactly elegant and the number of
requests types and their handling may go up a lot, but we need to do
this only for those heavily contended locks. Other callers can
continue with the current code style. But with this general
infrastructure, there will be still be a way to do this.

>  Also, you
> can't assume that the "work" can be done equally well by any backend.
> In this case it could, because the PGPROC structures are all in shared
> memory, but that won't work for something like GetSnapshotData(),
> which needs to copy a nontrivial amount of data into backend-local
> memory.

Yeah, I am not suggesting we should do (even though I think it should
be possible with appropriate input/output data) this everywhere. But
places where this can done, like end-transaction stuff, the
infrastructure might be quite useful.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-18 11:26:34
Message-ID: CABOikdMqH6hePzvAGtWf37TWO-OpmaeX7zQhHYYbV7N4e_1zSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 17, 2011 at 10:19 AM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> On Thu, Nov 17, 2011 at 10:01 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>>
>> I am not convinced that that's a better API.  I mean, consider
>> something like this:
>>
>>    /*
>>     * OK, let's do it.  First let other backends know I'm in ANALYZE.
>>     */
>>    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>>    MyProc->vacuumFlags |= PROC_IN_ANALYZE;
>>    LWLockRelease(ProcArrayLock);
>
>> I'm not sure exactly how you'd proposed to rewrite that, but I think
>> it's almost guaranteed to be more than three lines of code.
>
> I would guess the ReqRes will look something like this where
> ReqResRequest/Response would probably be union of all various requests
> and responses, one for each type of request:
>
> struct ReqRes {
>  ReqResRequestType  reqtype;
>  ReqResRequest         req;
>  ReqResResponse      res;
> }
>
> The code above can be rewritten as:
>
> reqRes.reqtype = RR_PROC_SET_VACUUMFLAGS;
> reqRes.req.set_vacuumflags.flags =  PROC_IN_ANALYZE;
> LWLockExecute(ProcArrayLock, LW_EXCLUSIVE, &reqRes);
>

My apologies for hijacking the thread, but the work seems quite
related, so I thought I should post here instead of starting a new
thread.

Here is a WIP patch based on the idea of having a shared Q. A process
trying to access the shared memory protected by a LWLock, sets up the
task in its PGPROC and calls a new API LWLockExecute(). If the LWLock
is available, the task is performed immediately and the function
returns. Otherwise, the process queues up itself on the lock. When the
last shared lock holder or the exclusive lock holder call
LWLockRelease(), it scans through such pending tasks, executes them
via a callback mechanism and wakes all those processes along with any
other normal waiter(s) waiting on LWLockAcquire().

I have only coded for ProcArrayEndTransaction, but it should fairly
easy to extend the usage at some more places, especially those which
does some simple modifications to the protected area. I don't propose
to use the technique for every user of LWLock, but there can be some
obvious candidates, including this one that Robert found out.

I see 35-40% improvement for 32-80 clients on a 5 minutes pgbench -N
run with scale factor of 100 and permanent tables. This is on a
32-core HP IA box.

There are few things that need some deliberations. The pending tasks
are right now executed while holding the mutex (spinlock). This is
good and bad for obvious reasons. We can possibly change that so that
the work is done without holding the spinlock or leave to the caller
to choose the behavior. Doing it without holding the spinlock will
make the technique interesting for many more callers. We can also
rework the task execution so that pending similar requests from
multiple callers can be combined and executed with a single callback,
if the caller knows its safe to do so. I haven't thought through the
API/callback changes to support that, but its definitely possible and
could be quite useful in many cases. For example, status of many
transactions can be checked with a single lookup of the ProcArray. Or
WAL inserts from multiple processes can be combined and written at
once.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

Attachment Content-Type Size
Shared-Q-v5.patch text/x-patch 15.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-18 16:59:35
Message-ID: CA+TgmoZjAmHjbjjyfFBO5-JFNe9Z4-SgPyqqUnRBi+42qpPsGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 18, 2011 at 6:26 AM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> My apologies for hijacking the thread, but the work seems quite
> related, so I thought I should post here instead of starting a new
> thread.
>
> Here is a WIP patch based on the idea of having a shared Q. A process
> trying to access the shared memory protected by a LWLock, sets up the
> task in its PGPROC and calls a new API LWLockExecute(). If the LWLock
> is available, the task is performed immediately and the function
> returns. Otherwise, the process queues up itself on the lock. When the
> last shared lock holder or the exclusive lock holder call
> LWLockRelease(), it scans through such pending tasks, executes them
> via a callback mechanism and wakes all those processes along with any
> other normal waiter(s) waiting on LWLockAcquire().
>
> I have only coded for ProcArrayEndTransaction, but it should fairly
> easy to extend the usage at some more places, especially those which
> does some simple modifications to the protected area. I don't propose
> to use the technique for every user of LWLock, but there can be some
> obvious candidates, including this one that Robert found out.
>
> I see 35-40% improvement for 32-80 clients on a 5 minutes pgbench -N
> run with scale factor of 100 and permanent tables. This is on a
> 32-core HP IA box.
>
> There are few things that need some deliberations. The pending tasks
> are right now executed while holding the mutex (spinlock). This is
> good and bad for obvious reasons. We can possibly change that so that
> the work is done without holding the spinlock or leave to the caller
> to choose the behavior. Doing it without holding the spinlock will
> make the technique interesting for many more callers. We can also
> rework the task execution so that pending similar requests from
> multiple callers can be combined and executed with a single callback,
> if the caller knows its safe to do so. I haven't thought through the
> API/callback changes to support that, but its definitely possible and
> could be quite useful in many cases. For example, status of many
> transactions can be checked with a single lookup of the ProcArray. Or
> WAL inserts from multiple processes can be combined and written at
> once.

So the upside and downside of this approach is that it modifies the
existing LWLock implementation rather than allowing multiple lock
implementations to exist side-by-side. That means every LWLock in the
system has access to this functionality, which might be convenient if
there turn out to be many uses for this technique. The bad news is
that everyone pays the cost of checking the work queue in
LWLockRelease(). It also means that you can't, for example, create a
custom lock with different lock modes (e.g. S, SX, X, as I proposed
upthread).

I am pretty dubious that there are going to be very many cases where
we can get away with holding the spinlock while doing the work. For
example, WAL flush is a clear example of where we can optimize away
spinlock acquisitions - if we communicate to people we wake up that
their LSN is already flushed, they needn't reacquire the lock to
check. But we certainly can't hold a spinlock across a WAL flush.
The nice thing about the FlexLock approach is that it permits
fine-grained control over these types of policies: one lock type can
switch to exclusive mode, do the work, and then reacquire the spinlock
to hand off the baton; another can do the work while holding the
spinlock; and still a third can forget about work queues altogether
but introduce additional lock modes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FlexLocks
Date: 2011-11-18 18:31:56
Message-ID: CABOikdN=3fNo9esDwgXnwymYEgmyqGWUMXHYB16igE84r-2iiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 18, 2011 at 10:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

>
> So the upside and downside of this approach is that it modifies the
> existing LWLock implementation rather than allowing multiple lock
> implementations to exist side-by-side.  That means every LWLock in the
> system has access to this functionality, which might be convenient if
> there turn out to be many uses for this technique.

Right.

> The bad news is
> that everyone pays the cost of checking the work queue in
> LWLockRelease().

I hope that would be minimal (may be just one instruction) for those
who don't want to use the facility.

>  It also means that you can't, for example, create a
> custom lock with different lock modes (e.g. S, SX, X, as I proposed
> upthread).
>

Thats a valid point.

> I am pretty dubious that there are going to be very many cases where
> we can get away with holding the spinlock while doing the work.  For
> example, WAL flush is a clear example of where we can optimize away
> spinlock acquisitions - if we communicate to people we wake up that
> their LSN is already flushed, they needn't reacquire the lock to
> check.  But we certainly can't hold a spinlock across a WAL flush.
>

I think thats mostly solvable as said upthread. We can and should
improve this mechanism so that the work is carried out with the
necessary LWLock instead of the spinlock. That would let other
processes to queue up for the lock while the tasks are being executed.
Or if the tasks only need shared lock, then other normal shared
requesters can go ahead and acquire the lock.

When I get some time, I would like to see if this can be extended to
have shared snapshots so that multiple callers of GetSnapshotData()
get the same snapshot, computed only once by scanning the proc array,
instead of having each process compute its own snapshot which remains
the same unless some transaction ends in between.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: FlexLocks
Date: 2011-11-21 16:56:44
Message-ID: 201111211656.pALGuiG00528@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:
> On Wed, Nov 16, 2011 at 12:25 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> >> We could alternatively change one or the other of them to be a
> >> struct with one member, but I think the cure might be worse than
> >> the disease. ?By my count, we are talking about saving perhaps as
> >> many as 34 lines of code changes here, and that's only if
> >> complicating the type handling doesn't require any changes to
> >> places that are untouched at present, which I suspect it would.
> >
> > So I stepped through all the changes of this type, and I notice that
> > most of them are in areas where we've talked about likely benefits
> > of creating new FlexLock variants instead of staying with LWLocks;
> > if any of that is done (as seems likely), it further reduces the
> > impact from 34 lines. ?If we take care of LWLockHeldByMe() as you
> > describe, I'll concede the FlexLockId changes.
>
> Updated patches attached.

It would be helpful if the patch included some text about how flexilocks
are different from ordinary lwlocks.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: FlexLocks
Date: 2011-11-22 22:35:51
Message-ID: 4ECBCF67020000250004341E@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Updated patches attached.

I've gotten through several days of performance tests for this pair
of related patches, with results posted on a separate thread. I'll
link those in to the CF application shortly. To summarize the other
(fairly long) thread on benchmarks, it seemed like there might be a
very slight slowdown at low concurrency, but it could be the random
alignment of code with and without the patch; it was a small enough
fraction of a percent to be negligible, in my opinion. At higher
concurrency levels the patch showed significant performance
improvements. Besides the overall improvement in the median tps
numbers of several percent, there was significant mitigation of the
"performance collapse" phenomenon, where some runs were much slower
than others. It seems a clear win in terms of performance.

I've gotten through code review of the flexlock-v2.patch, and have
decided to post on that before I go through the
procarraylock-v1.patch code.

Not surprisingly, this patch was in good form and applied cleanly.
There were doc changes, and I can't see where any changes to the
tests are required. I liked the structure, and only found a few
nit-picky things to point out:

I didn't see why num_held_flexlocks and held_flexlocks had the
static keyword removed from their declarations.

FlexLockRemember() seems to have a pasto for a comment. Maybe
change to something like: "Add lock to list of locks held by this
backend."

In procarraylock.c there is this:

/* If there are no lockers, clar the critical PGPROC fields. */

s/clar/clear/

I have to admit I don't have my head around the extraWaits issue, so
I can't personally vouch for that code, although I have no reason to
doubt it, either. Everything else was something that I at least
*think* I understand, and it looked good to me.

I'm not changing the status until I get through the other patch
file, which should be tomorrow.

-Kevin