Quick Links

Re: SKIP LOCKED DATA (work in progress)

Lists:	pgsql-hackers

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	SKIP LOCKED DATA (work in progress)
Date:	2014-05-13 23:06:20
Message-ID:	CADLWmXUvd5Z+cFczi6Zj1WcTrXzipgP-wj0pZOWSaRUy=F0omQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

A couple of years ago I posted an outline of a plan [1] and an
initial patch [2] for implementing SKIP LOCKED DATA. I have
recently come back to this idea, rebased the patch and added a
simple isolation test -- please see attached.

However, heap_lock_tuple is clearly not for the faint hearted,
and I freely admit that I don't understand half of the things
going on in there yet. My general approach has been to follow
the example set by NOWAIT, generalising that flag into a 3-way
policy... but of course NOWAIT doesn't have to worry about
cleaning anything up, because it uses ereport, hence my TODOs.

As always, I would be grateful for any feedback.

Thanks!

Thomas Munro

[1]
http://www.postgresql.org/message-id/CADLWmXV7inA-HS=bJ-s6+Ai8DsVGx8zMohr0Ht38EWmXNeqPWw@mail.gmail.com
[2]
http://www.postgresql.org/message-id/CADLWmXUUjmrPU-+9ss7BCATxM-hr6__9mB6Wv7ry3-r+KXGgBw@mail.gmail.com

Attachment	Content-Type	Size
skip-locked-data-v3.patch	text/x-patch	43.3 KB

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-13 23:23:35
Message-ID:	5372A977.3060303@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/14/2014 07:06 AM, Thomas Munro wrote:
> Hi
>
> A couple of years ago I posted an outline of a plan [1] and an
> initial patch [2] for implementing SKIP LOCKED DATA. I have
> recently come back to this idea, rebased the patch and added a
> simple isolation test -- please see attached.

Simon did some similar work a while ago, so I've cc'd him for his input.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-14 20:29:12
Message-ID:	CADLWmXU0ZZwKnJT+A=bm8AnBDTVJ-=kXr=Wcq4KMv6UC+B=-Lw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13 May 2014 23:06, Thomas Munro <munro(at)ip9(dot)org> wrote:

> A couple of years ago I posted an outline of a plan [1] and an
> initial patch [2] for implementing SKIP LOCKED DATA. I have
> recently come back to this idea, rebased the patch and added a
> simple isolation test -- please see attached.
>

I have attached a new version which copies the skipLocked member
in _copyPlanRowMark, _copyRowMarkClause and _copyLockClause.
Without these, SKIP LOCKED DATA didn't work in pl/pgsql (as reported
by David Rowley off-list). (Also some whitespace changes.)

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-data-v4.patch	text/x-patch	44.1 KB

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-16 08:46:55
Message-ID:	5375D07F.4040902@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I'm rebasing another implementation of this against current HEAD at the
moment. It was well tested but has bitrotted a bit, in particular it
needs merging with the multixact changes (eep).

That should provide a useful basis for comparison and a chance to share
ideas.

I'll follow up with the patch and a git tree when it's ready, hopefully
tonight.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-16 13:21:25
Message-ID:	537610D5.3090405@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/16/2014 04:46 PM, Craig Ringer wrote:
>
> I'll follow up with the patch and a git tree when it's ready, hopefully
> tonight.

Here's a rebased version of Simon's original patch that runs on current
master.

I still need to merge the isolation tests for it merged and sorted out,
and after re-reading it I'd like to change "waitMode" into an enum, not
just some #defines .

Hope it's useful for comparison and ideas.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
skip-locked.patch	text/x-patch	25.1 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-16 21:24:31
Message-ID:	CADLWmXXaDg1QzTOPjfB5oc1hWR+PSEcSyKtHNOC+xcQhAP2+yg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 16 May 2014 13:21, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On 05/16/2014 04:46 PM, Craig Ringer wrote:
> >
> > I'll follow up with the patch and a git tree when it's ready, hopefully
> > tonight.
>
> Here's a rebased version of Simon's original patch that runs on current
> master.
>
> I still need to merge the isolation tests for it merged and sorted out,
> and after re-reading it I'd like to change "waitMode" into an enum, not
> just some #defines .
>
> Hope it's useful for comparison and ideas.

Thank you! At first glance they're sort of similar which is reassuring.
I'm especially interested in the buffer release semantics which I was
confused about and will look into that (to resolve the TODO notes in my
patch).

I noticed that in applyLockingClause, Simon has "rc->waitMode |= waitMode".
Is that right? The values are 0, 1, and 2, but if you had both NOWAIT and
SKIP LOCKED somewhere in your query you could up with rc->waitMode == 3
unless I'm mistaken. In my patch I have code that would give precedence to
NOWAIT, though looking at it again it could be simpler. (One reviewer
pointed out, that it should really be a single unified enum. In fact I have
been a bit unsure about what scope such an enumeration should have in the
application -- could it even be used in parser code? I tried to follow
existing examples which is why I used #define macros in gram.y).

From a bikeshed colour point of view:
* I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like
Oracle, and I guess shorter is sweeter
* I used the term wait_policy and an enum, Simon used waitMode and an int
* I had noWait and skipLocked travelling separately in some places, Simon
had a single parameter, which is much better

Best regards,
Thomas Munro

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-17 05:02:25
Message-ID:	5376ED61.4040506@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 05/17/2014 05:24 AM, Thomas Munro wrote:

> I noticed that in applyLockingClause, Simon has "rc->waitMode |=
> waitMode". Is that right? The values are 0, 1, and 2, but if you had
> both NOWAIT and SKIP LOCKED somewhere in your query you could up with
> rc->waitMode == 3 unless I'm mistaken.

I do not think that |= is correct there.

It may be that no case can arise where you get the bogus value, but
since in all other places the values are tested for equalty not as bit
fields ( waitMode == NOWAIT not waitMode & NOWAIT ) it doesn't make
sense to |= it there.

? In my patch I have code that
> would give precedence to NOWAIT, though looking at it again it could be
> simpler.

I agree; if NOWAIT is present anywhere it should be preferred to
SKIP_LOCKED, as it's OK to apply NOWAIT where SKIP LOCKED appears, but
possibly semantically incorrect to apply SKIP LOCKED where only NOWAIT
was expected.

> (One reviewer pointed out, that it should really be a single
> unified enum. In fact I have been a bit unsure about what scope such an
> enumeration should have in the application -- could it even be used in
> parser code? I tried to follow existing examples which is why I used
> #define macros in gram.y).

Not sure there.

> From a bikeshed colour point of view:
> * I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like
> Oracle, and I guess shorter is sweeter

We have a long tradition of trying to allow noise keywords where it's
harmless.

So the clause should probably be

SKIP LOCKED [DATA]

in much the same way we have

BEGIN [ WORK | TRANSACTION ] ...

There won't be any ambiguity there.

> * I used the term wait_policy and an enum, Simon used waitMode and an int

I prefer an enum and intended to change Simon's patch but didn't have
the time.

I have some isolation tester and regression tests that are still to follow.

> * I had noWait and skipLocked travelling separately in some places,
> Simon had a single parameter, which is much better

Yes, I strongly prefer that.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-18 13:16:30
Message-ID:	CADLWmXWWVfk6NvJEdWJVuzq8iv02Z32j__yEQbWkfKOfhSM2mQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 17 May 2014 05:02, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 05/17/2014 05:24 AM, Thomas Munro wrote:
> > I noticed that in applyLockingClause, Simon has "rc->waitMode |=
> > waitMode". Is that right? The values are 0, 1, and
2, but if you had
> > both NOWAIT and SKIP LOCKED somewhere in your query you could up with
> > rc->waitMode == 3 unless I'm mistaken.
>
> I do not think that |= is correct there.
>
> It may be that no case can arise where you get the bogus value, but
> since in all other places the values are tested for equalty not as bit
> fields ( waitMode == NOWAIT not waitMode & NOWAIT ) it doesn't make
> sense to |= it there.

There are indeed ways (subselects etc) to have more than one
locking clause applying to the same table, which are supposed to
be merged into the appropriate wait policy by that line.

> ? In my patch I have code that
> > would give precedence to NOWAIT, though looking at it again it could be
> > simpler.
>
> I agree; if NOWAIT is present anywhere it should be preferred to
> SKIP_LOCKED, as it's OK to apply NOWAIT where
SKIP LOCKED appears, but
> possibly semantically incorrect to apply SKIP LOCKED where only NOWAIT
> was expected.

In the new v5 patch (attached) I used the same approach as
the "lock strength" support, that is, relying on the values of
the enumeration being arranged so that the enumerators that
should take precendence have higher numerical values so I could
just write:

rc->strength = Max(rc->strength, strength);

+ rc->waitPolicy = Max(rc->waitPolicy, waitPolicy);

> > (One reviewer pointed out, that it should really be a single
> > unified enum. In fact I have been a bit unsure about what scope such an
> > enumeration should have in the application -- could it even be used in
> > parser code? I tried to follow existing examples which is why I used
> > #define macros in gram.y).
>
> Not sure there.

I removed those macros and went for an enumeration -- see below.

> > From a bikeshed colour point of view:
> > * I used SKIP LOCKED DATA like DB2, and Simon used SKIP LOCKED like
> > Oracle, and I guess shorter is sweeter
>
> We have a long tradition of trying to allow noise keywords where it's
> harmless.
>
> So the clause should probably be
>
> SKIP LOCKED [DATA]

Ok, done in the v5 patch.

> I have some isolation tester and regression tests that are still to
follow.

Great, I am looking forward to seeing those. I have added an
isolation test for NOWAIT, and I am planning to design some tests
that will test the various affected branched of
heap_lock_tuple (eg multi-transaction ID etc), and test
interaction with various other features (various lock stregths,
UPDATEs, combinations of SKIP LOCKED DATA, NOWAIT and default
policies etc).

> > * I had noWait and skipLocked travelling separately in some places,
> > Simon had a single parameter, which is much better
>
> Yes, I strongly prefer that.

I agree in principal but couldn't decide *how many* enumerations
to use. In v5 I have added two more:

* one called LockClauseWaitPolicy defined in parsenodes.h that is
used in LockClause and RowMarkCause

* another called RowWaitPolicy defined in plannodes.h that is
used in PlanRowMark and ExecRowMark

The parser produces LockClauseWaitPolicy values, which InitPlan
converts to RowWaitPolicy values by simple assignment (the values
are the same, which obviously wouldn't work if this were compiled
as C++). Then ExecLockRows converts those into LockWaitPolicy
values when calling heal_lock_tuple.

Obviously there is an opportunity to unify all three, or at least
LockClauseWaitPolicy (parser) and RowWaitPolicy (plan and
execution), but I didn't know how appropriate that would be, and
followed the new lock strength feature for inspiration.

One difference between my patch and Simon's is that mine
introduces a new return value for heap_lock_type called
HeapTupleWouldBlock, whereas his returns the existin
HeapTupleBeingUpdated and then handles it differently in
ExecLockRows if in SKIP LOCKED mode.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-data-v5.patch	text/x-patch	48.5 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-19 22:24:14
Message-ID:	CADLWmXUkZq949q3YfHuyhV+90MnQ7LkkavtOv9MMbvqjzKOmQw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello

As a simple example for people wondering what the point of this
feature is, I created a table work (id, data, status)
and then create 10,000 items with status 'NEW' and then started
a number of worker threads that did the following pair of
transactions, with and without SKIP LOCKED DATA on the end of the
SELECT statement, until all rows were deleted:

BEGIN
SELECT id, data FROM work WHERE status = 'NEW' LIMIT 1 FOR UPDATE
-- if no rows returned, then finish
UPDATE work SET status = 'WORK' WHERE id = $id
COMMIT

BEGIN
DELETE FROM work WHERE id = $id
COMMIT

Here are the times taken to process all items, in elapsed seconds, on
a slow laptop (i5 4 core with an SSD, with default postgresql.conf
except for checkpoint_segments set to 300):

| Threads | default | SKIP |
| 1 | 26.78 | 27.02 |
| 2 | 23.46 | 22.00 |
| 3 | 22.02 | 14.83 |
| 4 | 22.59 | 11.16 |
| 5 | 22.37 | 9.05 |
| 6 | 22.55 | 7.66 |
| 7 | 22.46 | 6.69 |
| 8 | 22.57 | 8.39 |
| 9 | 22.40 | 8.38 |
| 10 | 22.38 | 7.93 |
| 11 | 22.43 | 6.86 |
| 12 | 22.34 | 6.77 |

I am not experienced at benchmarking and I don't claim that this
particular workload or configuration is particularly sensible or
representative of anything but it might give some idea of the
motivation.

Best regards,
Thomas Munro

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 11:26:17
Message-ID:	CA+TgmoZSW1Tn15nPO0D3bJQH54UMT0quKjxo6FJ_p7s4oGiocg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, May 17, 2014 at 1:02 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> We have a long tradition of trying to allow noise keywords where it's
> harmless.
>
> So the clause should probably be
>
> SKIP LOCKED [DATA]
>
> in much the same way we have
>
> BEGIN [ WORK | TRANSACTION ] ...
>
> There won't be any ambiguity there.

We've had some problems in the past where allowing optional noise
words resulted in grammar conflicts that made future features harder
to add. See previous discussions about LOCK TABLE, wherein we almost
went to the extreme of adding a completely separate ACQUIRE LOCK
command. A lot of these things seem harmless when you first do them,
and then later they seem less harmless.

Anyway, +1 for the general idea of this feature. It's come up a
number of times on this mailing list, and we've had customer requests
for it, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 14:40:10
Message-ID:	5263.1400856010@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, May 17, 2014 at 1:02 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> We have a long tradition of trying to allow noise keywords where it's
>> harmless.
>>
>> So the clause should probably be
>>
>> SKIP LOCKED [DATA]
>>
>> in much the same way we have
>>
>> BEGIN [ WORK | TRANSACTION ] ...
>>
>> There won't be any ambiguity there.

> We've had some problems in the past where allowing optional noise
> words resulted in grammar conflicts that made future features harder
> to add.

In this particular case, I'd be worried about whether we'd not end up
having to fully reserve DATA in order to allow it to be optional here.
That would be necessary if this clause could be followed immediately
by an identifier, either now or in the future. That would be a mighty
high price to pay for failing to make up our minds about which syntax
to use. (How many tables out there do you think have "data" as a column
name?)

A different concern is that this patch adds not one but two new unreserved
keywords, ie SKIP and LOCKED. That bloats our parser tables, which are
too darn large already, and it has a nonzero compatibility cost (since
we only allow AS-less column aliases when they are no keyword at all).
If we're pulling syntax out of the air it'd be nice if we could avoid
adding new keywords to the grammar.

regards, tom lane

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 18:01:01
Message-ID:	CADLWmXV_mGV-vckE=yjke+4SopmemxJ8vC9m-y9s+nAKW+3aSA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 23 May 2014 15:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A different concern is that this patch adds not one but two new unreserved
> keywords, ie SKIP and LOCKED. That bloats our parser tables, which are
> too darn large already, and it has a nonzero compatibility cost (since
> we only allow AS-less column aliases when they are no keyword at all).
> If we're pulling syntax out of the air it'd be nice if we could avoid
> adding new keywords to the grammar.

How about some of these combinations of existing words:

EXCLUDE LOCK
NOWAIT EXCLUDE
NOWAIT NEXT
NOWAIT FOLLOWING
NOWAIT DISCARD

Of those I think I prefer NOWAIT EXCLUDE (perhaps with NOWAIT ABORT as a
long version of the existing NOWAIT behaviour for contrast).

Or adding just one new keyword:

NOWAIT SKIP
SKIP LOCK

Regards,
Thomas Munro

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 19:24:07
Message-ID:	CA+U5nMJtJBmaV-XAmbB+eFuAvzopO0=iNaWMN143qQnD0H6AdA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 23 May 2014 10:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> If we're pulling syntax out of the air it'd be nice if we could avoid
> adding new keywords to the grammar.

Oracle, SQLServer and DB2 have this capability. MySQL does not.

SQLServer implements that using the table hint of READPAST. Since that
whole syntax area is radically different to what we have, it isn't
easy to maintain code compatibility.

DB2 z/OS 10 provides SKIP LOCKED DATA clause to allow moving past
already locked rows. That's fairly recent and I don't believe there
will be many programs using that. DB2 UDB supports some complex
functionality using DB2_SKIPINSERTED, DB2_EVALUNCOMMITTED and
DB2_SKIPDELETED, all of which is complex and mostly exists for
benchmarks, AFAICS.

Oracle uses both SKIP LOCKED and NOWAIT.

PostgreSQL already chose to follow the Oracle syntax when we
implemented NOWAIT. So my proposal is that we follow the Oracle syntax
again and use the words SKIP LOCKED.

I don't see any advantage in inventing new syntax that leaves us
incompatible with Oracle, nor do I see any need to be compatible with
both Oracle and DB2 since the latter is much less likely to gain us
anything in practice.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 19:31:52
Message-ID:	CAFj8pRAGxydLdigxb9cYoDPYZCqa-ajHfb2nd=n3ehhkMV2atA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2014-05-23 21:24 GMT+02:00 Simon Riggs <simon(at)2ndquadrant(dot)com>:

> On 23 May 2014 10:40, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > If we're pulling syntax out of the air it'd be nice if we could avoid
> > adding new keywords to the grammar.
>
> Oracle, SQLServer and DB2 have this capability. MySQL does not.
>
> SQLServer implements that using the table hint of READPAST. Since that
> whole syntax area is radically different to what we have, it isn't
> easy to maintain code compatibility.
>
> DB2 z/OS 10 provides SKIP LOCKED DATA clause to allow moving past
> already locked rows. That's fairly recent and I don't believe there
> will be many programs using that. DB2 UDB supports some complex
> functionality using DB2_SKIPINSERTED, DB2_EVALUNCOMMITTED and
> DB2_SKIPDELETED, all of which is complex and mostly exists for
> benchmarks, AFAICS.
>
> Oracle uses both SKIP LOCKED and NOWAIT.
>
> PostgreSQL already chose to follow the Oracle syntax when we
> implemented NOWAIT. So my proposal is that we follow the Oracle syntax
> again and use the words SKIP LOCKED.
>
> I don't see any advantage in inventing new syntax that leaves us
> incompatible with Oracle, nor do I see any need to be compatible with
> both Oracle and DB2 since the latter is much less likely to gain us
> anything in practice.
>

Pavel

>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <munro(at)ip9(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-05-23 20:18:54
Message-ID:	CA+TgmoZex=VZjAe2PRe_2w=VPAcuQy9DWH4eSLhDuagSDGNSbA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, May 23, 2014 at 3:24 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> PostgreSQL already chose to follow the Oracle syntax when we
> implemented NOWAIT. So my proposal is that we follow the Oracle syntax
> again and use the words SKIP LOCKED.
>
> I don't see any advantage in inventing new syntax that leaves us
> incompatible with Oracle, nor do I see any need to be compatible with
> both Oracle and DB2 since the latter is much less likely to gain us
> anything in practice.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-06-29 09:01:09
Message-ID:	CADLWmXUuy0w+kM8-5sf=oW-ZwEtKK0GTjfGxUWXC_EduOYBwAw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 23 May 2014 21:18, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, May 23, 2014 at 3:24 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> PostgreSQL already chose to follow the Oracle syntax when we
>> implemented NOWAIT. So my proposal is that we follow the Oracle syntax
>> again and use the words SKIP LOCKED.
>>
>> I don't see any advantage in inventing new syntax that leaves us
>> incompatible with Oracle, nor do I see any need to be compatible with
>> both Oracle and DB2 since the latter is much less likely to gain us
>> anything in practice.
>
> +1.

Hello

Please find attached a rebased version of my SKIP LOCKED
patch (formerly SKIP LOCKED DATA), updated to support only the
Oracle-like syntax.

I posted a small test program here:
https://github.com/macdice/skip-locked-tests

Would anyone who is interested in a SKIP LOCKED feature and
attending the CHAR(14)/PGDay UK conference next week be
interested in a birds-of-a-feather discussion?

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v6.patch	text/x-patch	62.6 KB

From:	Simon Riggs <simon(at)2ndQuadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-06-29 20:38:54
Message-ID:	CA+U5nMKuXtqBCGNmGB7w=+JxUg+i6DMQt55yKCBGRLmpZvpx3g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 29 June 2014 10:01, Thomas Munro <munro(at)ip9(dot)org> wrote:

> Would anyone who is interested in a SKIP LOCKED feature and
> attending the CHAR(14)/PGDay UK conference next week be
> interested in a birds-of-a-feather discussion?

Sounds like a plan. I'll check my schedule.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-23 12:15:11
Message-ID:	CAApHDvqOZGAdSexbMsVamr21QvPV=ELMhZiUPbzQeSKcbMo56Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Jun 29, 2014 at 9:01 PM, Thomas Munro <munro(at)ip9(dot)org> wrote:

>
> Please find attached a rebased version of my SKIP LOCKED
> patch (formerly SKIP LOCKED DATA), updated to support only the
> Oracle-like syntax.
>
>
Hi Thomas,

Apologies for taking this long to get to reviewing this, I'd gotten a bit
side tracked with my own patches during this commitfest.

I'm really glad to see this patch is back again. I think it will be very
useful for processing queues. I could have made good use of it in my last
work, using it for sending unsent emails which were "queued" up in a table
in the database.

I've so far read over the patch and done only some brief tests of the
functionality.

Here's what I've picked up on so far:

* In heap_lock_tuple() there's a few places where you test the wait_policy,
but you're not always doing this in the same order. The previous code did
if (nolock) first each time, but since there's now 3 values I think if
(wait_policy == LockWaitError) should be first each time as likely this is
the most common case.

* The following small whitespace change can be removed in gram.y:

@@ -119,7 +119,6 @@ typedef struct PrivTarget
#define CAS_NOT_VALID 0x10
#define CAS_NO_INHERIT 0x20

-
#define parser_yyerror(msg) scanner_yyerror(msg, yyscanner)
#define parser_errposition(pos) scanner_errposition(pos, yyscanner)

* analyze.c. rc->waitPolicy = Max(rc->waitPolicy, waitPolicy);
I'm not quite sure I understand this yet, but I'm guessing the original
code was there so that something like:

SELECT * FROM (SELECT * FROM a FOR UPDATE) a FOR UPDATE NOWAIT;
Would give:
ERROR: could not obtain lock on row in relation "a"

So it seems that with the patch as you've defined in by the order of the
enum values in LockClauseWaitPolicy that SKIP LOCKED overrides FOR UPDATE.
I'm wondering if this is dangerous.

Should the following really skip locked tuples?
SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE LIMIT 1;

But on the other hand perhaps I've missed a discussion on this, if so then
I think the following comment should be updated to explain it all:

* We also consider that NOWAIT wins if it's specified both ways. This
* is a bit more debatable but raising an error doesn't seem helpful.
* (Consider for instance SELECT FOR UPDATE NOWAIT from a view that
* internally contains a plain FOR UPDATE spec.)
*

* plannodes.h -> RowWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA
options */
Should be "NOWAIT and SKIP LOCKED options" since DATA has now been removed
from the syntax.

* nodeLockRow.c has extra space in if condition: else if (erm->waitPolicy
== RWP_SKIP )

I'm also wondering about this block of code in general:

if (erm->waitPolicy == RWP_WAIT)
wait_policy = LockWaitBlock;
else if (erm->waitPolicy == RWP_SKIP )
wait_policy = LockWaitSkip;
else /* erm->waitPolicy == RWP_NOWAIT */
wait_policy = LockWaitError;

Just after this code heap_lock_tuple() is called, and if that
returns HeapTupleWouldBlock, the code does a goto lnext, which then will
repeat that whole block again. I'm wondering why there's 2 enums that are
for the same thing? if you just had 1 then you'd not need this block of
code at all, you could just pass erm->waitPolicy to heap_lock_tuple().

* parsenodes.h comment does not meet project standards (
http://www.postgresql.org/docs/devel/static/source-format.html)

typedef enum LockClauseWaitPolicy
{
/* order is important (see applyLockingClause which takes the greatest
value when several wait policies have been specified), and values must
match RowWaitPolicy from plannodes.h */

* parsenode.h remove "DATA" from LockClauseWaitPolicy waitPolicy; /* NOWAIT
and SKIP LOCKED DATA */

I have noticed the /* TODO -- work out what needs to be released here */
comments in head_lock_tuple(), and perhaps the lack of cleaning up is what
is causing the following:

create table skiptest (id int primary key); insert into skiptest (id)
select x.x from generate_series(1,1000000) x(x);

Session 1: begin work; select * from skiptest for update limit 999999;
Session 2: select * from skiptest for update skip locked limit 1;
WARNING: out of shared memory
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

Yet if I do:

session 1: begin work; select * from skiptest where id > 1 for update;
session 2: select * from skiptest for update skip locked limit 1;
id
----
1
(1 row)

That test makes me think your todo comments are in the right place,
something is missing there for sure!

* plays about with patch for a bit *

I don't quite understand how heap_lock_tuple works, as if I debug session 2
from the above set of queries the first call
to ConditionalLockTupleTuplock() (heapam.c line 4236) succeeds where I'd
have thought it would fail, since session 1 should be locking this tuple?
Later at line 4527 on the line if (!ConditionalXactLockTableWait(xwait)),
it fails to grab the lock and returns HeapTupleWouldBlock. The above query
seems to work ok if I just apply the following patch:

diff --git a/src/backend/access/heap/heapam.c
b/src/backend/access/heap/heapam.c
index b661d0e..b3e9dcc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4525,8 +4525,10 @@ l3:
else /* wait_policy == LockWaitSkip */
{
if
(!ConditionalXactLockTableWait(xwait))
- /* TODO -- work out what
needs to be released here */
+ {
+
UnlockTupleTuplock(relation, tid, mode);
return HeapTupleWouldBlock;
+ }
}

/* if there are updates, follow the update
chain */

Running the test query again, I get:
select * from skiptest for update skip locked limit 1;
id
---------
1000000
(1 row)

Would you also be able to supply a rebased patch with current master? The
applying the patch is a bit noisy and the isolation tests part does not
seem to apply at all now, so I've not managed to look at those yet.

I've still got more to look at, but hopefully this will get things moving
again.

Regards

David Rowley

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-23 23:52:22
Message-ID:	CADLWmXWakSyLGyYQPCM-Mob=tZDEWEfAJ-ghN6-SXozHbksTaA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 23 July 2014 13:15, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> * In heap_lock_tuple() there's a few places where you test the wait_policy,
> but you're not always doing this in the same order. The previous code did if
> (nolock) first each time, but since there's now 3 values I think if
> (wait_policy == LockWaitError) should be first each time as likely this is
> the most common case.

Ok, I have made the them consistent -- though I put LockWaitBlock
first as that is surely the most common case (plain old FOR UPDATE).

> * The following small whitespace change can be removed in gram.y:
> [snip]

Fixed.

> * analyze.c. rc->waitPolicy = Max(rc->waitPolicy, waitPolicy);
> I'm not quite sure I understand this yet, but I'm guessing the original code
> was there so that something like:
>
> SELECT * FROM (SELECT * FROM a FOR UPDATE) a FOR UPDATE NOWAIT;
> Would give:
> ERROR: could not obtain lock on row in relation "a"

Surely only if another session already has one or more tuples locked?

> So it seems that with the patch as you've defined in by the order of the
> enum values in LockClauseWaitPolicy that SKIP LOCKED overrides FOR UPDATE.
> I'm wondering if this is dangerous.

My understanding was that we have to choose a single policy for
attempts to lock rows in that relation, and if you've stated in at
least one place that you don't want to wait, then it makes sense for
NOWAIT to take precedence. I had to figure out how to fit SKIP LOCKED
into that hierarchy, and figured that NOWAIT should still be the
strongest policy, overriding all others, and SKIP LOCKED should
overrule default FOR UPDATE (ie blocking).

> Should the following really skip locked tuples?
> SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE LIMIT 1;

I agree that it's somewhat arbitrary... I suppose the two most
obvious options are either to reduce to a single policy using some
rule (such as the NOWAIT > SKIP LOCKED > default hierarchy I propose),
or simply reject such queries (which I'm guessing wouldn't fly at this
point since existing valid queries would become invalid).

> But on the other hand perhaps I've missed a discussion on this, if so then I
> think the following comment should be updated to explain it all:
>
> * We also consider that NOWAIT wins if it's specified both ways. This
> * is a bit more debatable but raising an error doesn't seem helpful.
> * (Consider for instance SELECT FOR UPDATE NOWAIT from a view that
> * internally contains a plain FOR UPDATE spec.)

Modified to address SKIP LOCKED (there are also couple of words to the
same effect in select.sgml).

> * plannodes.h -> RowWaitPolicy waitPolicy; /* NOWAIT and SKIP LOCKED DATA
> options */
> Should be "NOWAIT and SKIP LOCKED options" since DATA has now been removed
> from the syntax.
>
> * nodeLockRow.c has extra space in if condition: else if (erm->waitPolicy ==
> RWP_SKIP )

Fixed.

> I'm also wondering about this block of code in general:
>
> if (erm->waitPolicy == RWP_WAIT)
> wait_policy = LockWaitBlock;
> else if (erm->waitPolicy == RWP_SKIP )
> wait_policy = LockWaitSkip;
> else /* erm->waitPolicy == RWP_NOWAIT */
> wait_policy = LockWaitError;
>
> Just after this code heap_lock_tuple() is called, and if that returns
> HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that
> whole block again. I'm wondering why there's 2 enums that are for the same
> thing? if you just had 1 then you'd not need this block of code at all, you
> could just pass erm->waitPolicy to heap_lock_tuple().

True. Somewhere upthread I mentioned the difficulty I had deciding
how many enumerations were needed, for the various subsystems, ie
which headers and type they were allowed to share. Then I put off
working on this for so long that a nice example came along that showed
me the way: the lock strength enums LockTupleMode (heapam.h) and
RowMarkType (plannodes.h). The wait policy enums LockWaitPolicy
(heapam.h) and RowWaitPolicy (plannodes.h) mirror them closely, and
the same type of enumeration translation takes place in nodeLockRows.c
immediately above the code you pasted. I don't have any more
principled argument than monkey-see-monkey-do for that one...

> * parsenodes.h comment does not meet project standards
> (http://www.postgresql.org/docs/devel/static/source-format.html)
>
> typedef enum LockClauseWaitPolicy
> {
> /* order is important (see applyLockingClause which takes the greatest
> value when several wait policies have been specified), and values must
> match RowWaitPolicy from plannodes.h */

Fixed.

> * parsenode.h remove "DATA" from LockClauseWaitPolicy waitPolicy; /* NOWAIT
> and SKIP LOCKED DATA */

Fixed.

> I have noticed the /* TODO -- work out what needs to be released here */
> comments in head_lock_tuple(), and perhaps the lack of cleaning up is what
> is causing the following:
>
> create table skiptest (id int primary key); insert into skiptest (id) select
> x.x from generate_series(1,1000000) x(x);
>
> Session 1: begin work; select * from skiptest for update limit 999999;
> Session 2: select * from skiptest for update skip locked limit 1;
> WARNING: out of shared memory
> ERROR: out of shared memory
> HINT: You might need to increase max_locks_per_transaction.
>
> Yet if I do:
>
> session 1: begin work; select * from skiptest where id > 1 for update;
> session 2: select * from skiptest for update skip locked limit 1;
> id
> ----
> 1
> (1 row)
>
> That test makes me think your todo comments are in the right place,
> something is missing there for sure!
>
> * plays about with patch for a bit *
>
> I don't quite understand how heap_lock_tuple works, as if I debug session 2
> from the above set of queries the first call to
> ConditionalLockTupleTuplock() (heapam.c line 4236) succeeds where I'd have
> thought it would fail, since session 1 should be locking this tuple? Later
> at line 4527 on the line if (!ConditionalXactLockTableWait(xwait)), it fails
> to grab the lock and returns HeapTupleWouldBlock. The above query seems to
> work ok if I just apply the following patch:
>
> diff --git a/src/backend/access/heap/heapam.c
> b/src/backend/access/heap/heapam.c
> index b661d0e..b3e9dcc 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -4525,8 +4525,10 @@ l3:
> else /* wait_policy == LockWaitSkip */
> {
> if
> (!ConditionalXactLockTableWait(xwait))
> - /* TODO -- work out what
> needs to be released here */
> + {
> + UnlockTupleTuplock(relation,
> tid, mode);
> return HeapTupleWouldBlock;
> + }
> }
>
> /* if there are updates, follow the update
> chain */
>
> Running the test query again, I get:
> select * from skiptest for update skip locked limit 1;
> id
> ---------
> 1000000
> (1 row)

Ahh! Ok, take a look now, I included that change, but also made the
unlocking conditional on have_tuple_lock, in the second two cases
where HeapTupleWouldBlock is returned. In the first case it doesn't
look possible for it to be true.

> Would you also be able to supply a rebased patch with current master? The
> applying the patch is a bit noisy and the isolation tests part does not seem
> to apply at all now, so I've not managed to look at those yet.

Done, and attached with the above changes.

> I've still got more to look at, but hopefully this will get things moving
> again.

Thank you very much for taking the time to look at this!

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v7.patch	text/x-patch	63.4 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-24 23:16:38
Message-ID:	CADLWmXXOw+PMnZz9T1W8fGkx2518LDUuj-+G=ipnv2OZnQWdMw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 24 July 2014 00:52, Thomas Munro <munro(at)ip9(dot)org> wrote:
> On 23 July 2014 13:15, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> I'm also wondering about this block of code in general:
>>
>> if (erm->waitPolicy == RWP_WAIT)
>> wait_policy = LockWaitBlock;
>> else if (erm->waitPolicy == RWP_SKIP )
>> wait_policy = LockWaitSkip;
>> else /* erm->waitPolicy == RWP_NOWAIT */
>> wait_policy = LockWaitError;
>>
>> Just after this code heap_lock_tuple() is called, and if that returns
>> HeapTupleWouldBlock, the code does a goto lnext, which then will repeat that
>> whole block again. I'm wondering why there's 2 enums that are for the same
>> thing? if you just had 1 then you'd not need this block of code at all, you
>> could just pass erm->waitPolicy to heap_lock_tuple().
>
> True. Somewhere upthread I mentioned the difficulty I had deciding
> how many enumerations were needed, for the various subsystems, ie
> which headers and type they were allowed to share. Then I put off
> working on this for so long that a nice example came along that showed
> me the way: the lock strength enums LockTupleMode (heapam.h) and
> RowMarkType (plannodes.h). The wait policy enums LockWaitPolicy
> (heapam.h) and RowWaitPolicy (plannodes.h) mirror them closely, and
> the same type of enumeration translation takes place in nodeLockRows.c
> immediately above the code you pasted. I don't have any more
> principled argument than monkey-see-monkey-do for that one...

On reflection, I agree that this sucks, and I would like to unify the
three new enums in the current patch (see below for recap) into one
that can be passed between parser, planner, executor and heap access
manager code as I think you are suggesting. My only question is: in
which header should the enum be defined, that all those modules could
include?

Best regards,
Thomas Munro

Enumeration explosion recap:

* parsenode.h defines enum LockClauseWaitPolicy, which is used in
the structs LockClause and RowMarkClause, for use by the parser
code

* plannodes.h defines enum RowWaitPolicy, which is used in the
structs PlanRowMark and ExecRowMark, for use by the planner and
executor code (numbers coincide with LockClauseWaitPolicy)

* heapam.h defines enum LockWaitPolicy, which is used as a
parameter to heap_lock_tuple, for use by heap access code

The parser produces LockClauseWaitPolicy values. InitPlan
converts these to RowWaitPolicy values in execMain.c. Then
nodeLockRows.c converts RowWaitPolicy values to LockWaitPolicy
values (by if-then-else) so it can call heap_lock_tuple.

This roughly mirrors what happens to lock strength information.
The unpatched code simply passes a boolean 'nowait' flag around.
An earlier version of my patch passed a pair of booleans around.
Simon's independent patch[1] uses an int in the various node structs
and the heap_lock_tuple function, and in execNode.h it has macros to
give names to the values, and that is included by access/heapm.h.

[1] http://www.postgresql.org/message-id/537610D5.3090405@2ndquadrant.com

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-26 09:34:19
Message-ID:	CADLWmXVW=p287rA1jMf3zw6HMd1AQv2-8wL1t_mQFwAmCkypYw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I tried getting rid of the offending if-then-else enum conversion code
and replaced it with a simple assignment -- please see attached. I
also added compile time assertions that the enum values line up to
make that work, and are correctly ordered for use in that 'Max'
expression. Please let me know if you think this is an improvement or
an abomination.

I couldn't find an existing reasonable place to share a single wait
policy enumeration between parser/planner/executor and the heap access
module, and I get the feeling that it would be unacceptable to
introduce one.

I suppose that the LockClauseWaitPolicy and RowWaitPolicy could at
least be merged into a single enum defined in nodes.h following the
example of CmdType, which is also used by both parsenodes.h and
plannnode.h, but do I detect a tiny hint of reluctance in its comment,
"so put it here..."?

(The attached patch also has a couple of trivial typo fixes in
documentation and comments).

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v8.patch	text/x-patch	63.9 KB

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-26 09:58:43
Message-ID:	CAApHDvo4D=RUYViqioTy0Q3H0c7X06ywjuykbt_3ij16RuHcoA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, Jul 26, 2014 at 9:34 PM, Thomas Munro <munro(at)ip9(dot)org> wrote:

> I couldn't find an existing reasonable place to share a single wait
> policy enumeration between parser/planner/executor and the heap access
> module, and I get the feeling that it would be unacceptable to
> introduce one.
>
>
I guess the way I justify it in my head is something like, "the 3 enums are
for the same purpose, so having 3 exist all with different names is
confusing and it makes the code harder to follow". So to fix that up I
think, "oh we can just give them all the same name... But then, how can be
we be sure each definition matches the other 2?" ... hmm, "just merge it
into one and put it somewhere that can be accessed from everywhere."

Saying that I don't know what the project best practises are for locations
for sharing such things, but if nothing exists then maybe this would be a
good time to invent somewhere.

Maybe someone with more experience can chime in and give advice on this?

Regards

David Rowley

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-26 14:43:01
Message-ID:	27211.1406385781@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro <munro(at)ip9(dot)org> writes:
> I couldn't find an existing reasonable place to share a single wait
> policy enumeration between parser/planner/executor and the heap access
> module, and I get the feeling that it would be unacceptable to
> introduce one.

There is a precedent in the form of AclMode, which is needed throughout
the system and is currently declared in parsenodes.h. I can't say I've
ever been particularly pleased with that arrangement though, since it
forces inclusion of parsenodes.h in many places that might not otherwise
have any interest in parse nodes.

It might be better if we'd declared AclMode in a single-purpose header,
say utils/aclmode.h, and then #include'd that into parsenodes.h.
There's certainly plenty of other single-datatype headers laying about.

regards, tom lane

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-26 16:49:59
Message-ID:	CADLWmXVSE5_R45KegC=OFvNG+34oAuTxn02eVw17iFCcfdqfyQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 26 July 2014 15:43, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <munro(at)ip9(dot)org> writes:
>> I couldn't find an existing reasonable place to share a single wait
>> policy enumeration between parser/planner/executor and the heap access
>> module, and I get the feeling that it would be unacceptable to
>> introduce one.
>
> There is a precedent in the form of AclMode, which is needed throughout
> the system and is currently declared in parsenodes.h. I can't say I've
> ever been particularly pleased with that arrangement though, since it
> forces inclusion of parsenodes.h in many places that might not otherwise
> have any interest in parse nodes.
>
> It might be better if we'd declared AclMode in a single-purpose header,
> say utils/aclmode.h, and then #include'd that into parsenodes.h.
> There's certainly plenty of other single-datatype headers laying about.

Here is a new version of the patch with a single enum LockWaitPolicy
defined in utils/lockwaitpolicy.h.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v9.patch	text/x-patch	63.7 KB

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-27 13:31:41
Message-ID:	CAApHDvq=xy7B5UVwWn0s-Kff-tdJmtYzbwjvUdmgFeVZDAh30w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Jul 27, 2014 at 4:49 AM, Thomas Munro <munro(at)ip9(dot)org> wrote:

> Here is a new version of the patch with a single enum LockWaitPolicy
> defined in utils/lockwaitpolicy.h.
>
>
That seems much cleaner

A few more comments:
You seem to have lost the comment which indicates that the values of the
enum are important due to the code in applyLockingClause(), but I see now
instead that you've added some assert checks in applyLockingClause(),
likely these should use Assert() rather than StaticAssertExpr().

I've also been looking at the isolation tests and I see that you've added a
series of tests for NOWAIT. I was wondering why you did that as that's
really existing code, probably if you thought the tests were a bit thin
around NOWAIT then maybe that should be a separate patch?

In ExecLockRows(), is there a need to define the wait_policy variable now?
It's just used once so you could probably just pass erm->waitPolicy
directly to heap_lock_tuple().

I'm a bit confused at some of the code in heap_lock_tuple(). If I'm not
wrong then after the line that does have_tuple_lock = true; it's never
possible for have_tuple_lock to be false, but I see you've added checks to
ensure we only unlock if have_tuple_lock is true. I'm thinking you probably
did this because in the goto failed situation the check is done, but I was
thinking that was perhaps put there in case a goto jump was added before
have_tuple_lock is set to true. I'm wondering if it would be ok just to
replace the test with an Assert() instead, or maybe just no check.

Also, I'm just looking at some of the changes that you've done to function
signatures... I see quite a few of them are now beyond 80 chars wide (see
http://www.postgresql.org/docs/devel/static/source-format.html).

Regards

David Rowley

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-27 22:19:45
Message-ID:	CADLWmXUavPdozuk785cg5CW=7bCZHZmvA-Yk5+9itb8aTdF-eg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 27 July 2014 14:31, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Sun, Jul 27, 2014 at 4:49 AM, Thomas Munro <munro(at)ip9(dot)org> wrote:
>>
>> Here is a new version of the patch with a single enum LockWaitPolicy
>> defined in utils/lockwaitpolicy.h.
>>
>
> That seems much cleaner
>
> A few more comments:
> You seem to have lost the comment which indicates that the values of the
> enum are important due to the code in applyLockingClause(), but I see now
> instead that you've added some assert checks in applyLockingClause(), likely
> these should use Assert() rather than StaticAssertExpr().

Here's a new version with explicit numerical values and a comment in
lockwaitpolicy.h to explain that the order is important and point to
the relevant code. The assertions are about the relationship between
constant values known at compile time, so why would we want a runtime
assertion? I have changed it from StaticAssertExpr to
StaticAssertStmt though.

> I've also been looking at the isolation tests and I see that you've added a
> series of tests for NOWAIT. I was wondering why you did that as that's
> really existing code, probably if you thought the tests were a bit thin
> around NOWAIT then maybe that should be a separate patch?

Since I was meddling with code that controls both NOWAIT and SKIP
LOCKED, I wanted to convince myself that I had not broken NOWAIT using
at least a basic smoke test . I suppose by the same logic I should
also wite isolation tests for default blocking FOR UPDATE... Ok, I've
taken nowait.spec out for now.

On the subject of isolation tests, I think skip-locked.spec is only
producing schedules that reach third of the three 'return
HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up
with some more thorough isolation tests in the next week or so to
cover the other two, and some other scenarios and interactions with
other feature.

> In ExecLockRows(), is there a need to define the wait_policy variable now?
> It's just used once so you could probably just pass erm->waitPolicy directly
> to heap_lock_tuple().

Fixed.

> I'm a bit confused at some of the code in heap_lock_tuple(). If I'm not
> wrong then after the line that does have_tuple_lock = true; it's never
> possible for have_tuple_lock to be false, but I see you've added checks to
> ensure we only unlock if have_tuple_lock is true. I'm thinking you probably
> did this because in the goto failed situation the check is done, but I was
> thinking that was perhaps put there in case a goto jump was added before
> have_tuple_lock is set to true. I'm wondering if it would be ok just to
> replace the test with an Assert() instead, or maybe just no check.

Right, I have removed the redundant conditionals.

> Also, I'm just looking at some of the changes that you've done to function
> signatures... I see quite a few of them are now beyond 80 chars wide (see
> http://www.postgresql.org/docs/devel/static/source-format.html).

Fixed.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v10.patch	text/x-patch	48.6 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-28 21:48:19
Message-ID:	CADLWmXWxuTLwRwXSiXzdeXyxXzTiC532pFUmVs8OuLp6XPE5Mw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 27 July 2014 23:19, Thomas Munro <munro(at)ip9(dot)org> wrote:
> On the subject of isolation tests, I think skip-locked.spec is only
> producing schedules that reach third of the three 'return
> HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up
> with some more thorough isolation tests in the next week or so to
> cover the other two, and some other scenarios and interactions with
> other feature.

Now with extra isolation tests so that the three different code
branches that can skip rows are covered. I temporarily added some
logging lines to double check that the expected branches are reached
by each permutation while developing the specs. They change the
output and are not part of the patch -- attaching separately.

Attachment	Content-Type	Size
skip-locked-v11.patch	text/x-patch	54.0 KB
debug-why-skipped.patch	text/x-patch	963 bytes

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-29 01:29:48
Message-ID:	20140729012948.GO5475@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:

> It might be better if we'd declared AclMode in a single-purpose header,
> say utils/aclmode.h, and then #include'd that into parsenodes.h.
> There's certainly plenty of other single-datatype headers laying about.

Do you mean src/include/datatype/aclmode.h?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-29 01:35:12
Message-ID:	20140729013511.GP5475@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

David Rowley wrote:

The isolation tester is new so we don't have nearly enough tests for it.
Adding more meaningful tests is good even if they're unrelated to the
patch at hand.

FWIW you can use configure --enable-coverage and "make coverage-html" to
get coverage reports.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-29 01:39:14
Message-ID:	16322.1406597954@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> It might be better if we'd declared AclMode in a single-purpose header,
>> say utils/aclmode.h, and then #include'd that into parsenodes.h.
>> There's certainly plenty of other single-datatype headers laying about.

> Do you mean src/include/datatype/aclmode.h?

I was thinking src/include/utils/, actually, but maybe datatype/ would
be a good choice.

OTOH, what we've got in there now is just timestamp.h, and IIRC it was put
there because it needed to be accessible from both frontend and backend
contexts. That would not be true of aclmode.h, so perhaps aclmode.h
doesn't belong there.

regards, tom lane

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-07-30 21:29:18
Message-ID:	CADLWmXWS9kpSNNLaD_Kst_iQJ-ezpb0LX7vzmkBTLh0VUzUQSA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 29 July 2014 02:35, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> David Rowley wrote:
>
>> I've also been looking at the isolation tests and I see that you've added a
>> series of tests for NOWAIT. I was wondering why you did that as that's
>> really existing code, probably if you thought the tests were a bit thin
>> around NOWAIT then maybe that should be a separate patch?
>
> The isolation tester is new so we don't have nearly enough tests for it.
> Adding more meaningful tests is good even if they're unrelated to the
> patch at hand.

Here are my isolation tests for NOWAIT as a separate patch,
independent of SKIP LOCKED. They cover the tuple lock, regular row
lock and multixact row lock cases. I guess this might be called white
box testing, since it usese knowledge of how to construct schedules
that hit the three interesting code paths that trigger the error, even
though you can't see from the output why the error was raised in each
case without extra instrumentation (though it did cross my mind that
it could be interesting at the very least for testing if the error
message were different in each case). If there are no objections I
will add this to the next commitfest.

Best regards
Thomas Munro

Attachment	Content-Type	Size
nowait-isolation-tests.patch	text/x-patch	7.2 KB

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-01 09:28:26
Message-ID:	CAApHDvr=q+7zyMPdEhnN8yu+BBQQa0jw5STrpfvu6ijkXv86dQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jul 29, 2014 at 1:35 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> David Rowley wrote:
>
> > I've also been looking at the isolation tests and I see that you've
> added a
> > series of tests for NOWAIT. I was wondering why you did that as that's
> > really existing code, probably if you thought the tests were a bit thin
> > around NOWAIT then maybe that should be a separate patch?
>
> The isolation tester is new so we don't have nearly enough tests for it.
> Adding more meaningful tests is good even if they're unrelated to the
> patch at hand.
>
>
I completely agree that some more isolation tests coverage would be a good
thing. I just saw it as something not directly related to this feature, so
thought it would be better as a separate patch. From my experience with the
project, normally when I try to sneak something extra in, it either does
not make the final commit, or gets added in a separate commit.

Regards

David Rowley

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-01 09:37:24
Message-ID:	CAApHDvq+Kc0KnpFN8D=32GQuRXqxD71snHFebv2jeOsRJjAkNA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jul 29, 2014 at 9:48 AM, Thomas Munro <munro(at)ip9(dot)org> wrote:

> On 27 July 2014 23:19, Thomas Munro <munro(at)ip9(dot)org> wrote:
> > On the subject of isolation tests, I think skip-locked.spec is only
> > producing schedules that reach third of the three 'return
> > HeapTupleWouldBlock' statements in heap_lock_tuple. I will follow up
> > with some more thorough isolation tests in the next week or so to
> > cover the other two, and some other scenarios and interactions with
> > other feature.
>
> Now with extra isolation tests so that the three different code
> branches that can skip rows are covered. I temporarily added some
> logging lines to double check that the expected branches are reached
> by each permutation while developing the specs. They change the
> output and are not part of the patch -- attaching separately.
>

I've had a look over the isolation tests now and I can't see too much
wrong, just a couple of typos...

* skip-locked-2.spec

# s2 againt skips because it can't acquired a multi-xact lock

"againt" should be "again"

also mixed use of multixact and multi-xact, probably would be better to
stick to just 1.

Also this file would probably be slightly easier to read with a new line
after each "permutation" line.

* skip_locked-3.spec

# s3 skips to second record due to tuple lock held by s2

There's a missing "the" after "skips to"

Also, won't the lock be held by s1 not s2?

There's just a couple of other tiny things.

* Some whitespace issues shown by git diff --check

src/backend/parser/gram.y:9221: trailing whitespace.
+opt_nowait_or_skip:
src/backend/rewrite/rewriteHandler.c:65: trailing whitespace.
+
LockClauseStrength strength, LockWaitPolicy waitPolicy,

* analyze.c

The StaticAssertStmt's I think these should be removed. The only place
where this behaviour can be changed
is in lockwaitpolicy.h, I think it would be better to just strengthen the
comment on the enum's definition.

Perhaps something more along the lines of:

Policy for what to do when a row lock cannot be obtained immediately.

The enum values defined here have critical importance to how the parser
treats multiple FOR UPDATE/SHARE statements in different nested levels of
the query. Effectively if multiple locking clauses are defined in the query
then the one with the highest enum value takes precedence over the others.

Apart from this I can't see any other problems with the patch and I'd be
very inclined, once the above are fixed up to mark the patch ready for
commiter.

Good work

Regards

David Rowley

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-01 15:55:55
Message-ID:	CADLWmXUHDKVbqXvpxjwcNcaRwv73UfYgYq3+p=AjurMSU_oPVQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 1 August 2014 10:37, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> * skip-locked-2.spec
>
> # s2 againt skips because it can't acquired a multi-xact lock
>
> "againt" should be "again"

Fixed.

> also mixed use of multixact and multi-xact, probably would be better to
> stick to just 1.

Changed to multixact as seen in other places.

> Also this file would probably be slightly easier to read with a new line
> after each "permutation" line.

Done.

> * skip_locked-3.spec
>
> # s3 skips to second record due to tuple lock held by s2
>
> There's a missing "the" after "skips to"

Fixed.

> Also, won't the lock be held by s1 not s2?

s1 holds the row lock, and s2 holds the tuple lock (because it is at
the head of the queue waiting for the row lock). s3 skips because it
couldn't acquire the tuple lock held by s2. The other two specs are
about not being able to acquire a row lock and only need two sessions.
This one tests the code path when there is already a queue forming and
you can't even get the tuple lock, which requires an extra session. I
have updated the comment to make that clearer.

> There's just a couple of other tiny things.
>
> * Some whitespace issues shown by git diff --check
>
> src/backend/parser/gram.y:9221: trailing whitespace.
> +opt_nowait_or_skip:
> src/backend/rewrite/rewriteHandler.c:65: trailing whitespace.
> +
> LockClauseStrength strength, LockWaitPolicy waitPolicy,

Fixed.

> * analyze.c
>
> The StaticAssertStmt's I think these should be removed. The only place where
> this behaviour can be changed
> is in lockwaitpolicy.h, I think it would be better to just strengthen the
> comment on the enum's definition.

Removed.

> Perhaps something more along the lines of:
>
> Policy for what to do when a row lock cannot be obtained immediately.
>
> The enum values defined here have critical importance to how the parser
> treats multiple FOR UPDATE/SHARE statements in different nested levels of
> the query. Effectively if multiple locking clauses are defined in the query
> then the one with the highest enum value takes precedence over the others.

Added something along those lines.

> Apart from this I can't see any other problems with the patch and I'd be
> very inclined, once the above are fixed up to mark the patch ready for
> commiter.
>
> Good work

Thanks for all the guidance, I appreciate it! My review karma account
is now well overdrawn.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v12.patch	text/x-patch	54.1 KB

From:	David Rowley <dgrowleyml(at)gmail(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-02 00:59:15
Message-ID:	CAApHDvp=9ZcnKp1BnBoFQsaJf7_0N3M1J9WY=+n-n97HwOeXOA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sat, Aug 2, 2014 at 3:55 AM, Thomas Munro <munro(at)ip9(dot)org> wrote:

> On 1 August 2014 10:37, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > Apart from this I can't see any other problems with the patch and I'd be
> > very inclined, once the above are fixed up to mark the patch ready for
> > commiter.
> >
> > Good work
>
> Thanks for all the guidance, I appreciate it! My review karma account
> is now well overdrawn.
>

Ok, then I have nothing more so it's time to pass this one along.

The only notes I can think to leave for the commiter would be around the
precedence order of the lock policy, especially around a query such as:

SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE; --
skip locked wins

Of course the current behaviour is that NOWAIT wins over the standard FOR
UPDATE, but with NOWAIT, there's only a chance of an error, there's no
chance of giving incorrect results.

I checked what Oracle did in this situation and I see that they completely
disallow FOR UPDATE inside of views and subqueries.

I could see an argument here that the outer most FOR UPDATE clause should
be used, but I guess that ship has sailed when NOWAIT was added.

Marking as ready for commiter.

Regards

David Rowley

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	David Rowley <dgrowleyml(at)gmail(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-04 16:02:44
Message-ID:	20140804160244.GH5475@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

David Rowley wrote:

> The only notes I can think to leave for the commiter would be around the
> precedence order of the lock policy, especially around a query such as:
>
> SELECT * FROM (SELECT * FROM a FOR UPDATE SKIP LOCKED) a FOR UPDATE; --
> skip locked wins
>
> Of course the current behaviour is that NOWAIT wins over the standard FOR
> UPDATE, but with NOWAIT, there's only a chance of an error, there's no
> chance of giving incorrect results.

Another option is to throw an error at parse analysis time if there is a
conflict in the specified locking policies, as in the above case. Are
there cases in which it would make sense to have one clause trump the
other? It seems reasonable to have NOWAIT trump regular FOR UPDATE (as
it already does), since, as you say, there's chance of error being
thrown at runtime, but not of incorrect result.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-22 19:28:40
Message-ID:	20140822192839.GP6343@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

One thing I just noticed is that we uselessly set an error context
callback when "waiting" in ConditionalMultiXactIdWait, which is pretty
useless (because we don't actually wait there at all) -- we don't set
one in ConditionalXactLockTableWait, which makes sense, but for some
reason I failed to realize this in review of f88d4cfc9. I think I will
take that out instead of cluttering this patch with it.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-22 22:02:42
Message-ID:	20140822220242.GT6343@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

heap_lock_tuple() has the following comment on top:

* In the failure cases, the routine fills *hufd with the tuple's t_ctid,
* t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.

With the patch as submitted, this struct is not filled in the
HeapTupleWouldBlock case. I'm not sure this is okay, though I admit the
only caller that passes LockWaitSkip doesn't care, so maybe we could
just ignore the issue (after properly modifying the comment). It seems
easy to add a LockBuffer() and "goto failed" rather than a return; that
would make that failure case conform to HeapTupleUpdated and
HeapTupleSelfUpdated. OTOH it might be pointless to worry about what
would be essentially dead code currently ...

Did you consider heap_lock_updated_tuple? A rationale for saying it
doesn't need to pay attention to the wait policy is: if you're trying to
lock-skip-locked an updated tuple, then you either skip it because its
updater is running, or you return it because it's no longer running; and
if you return it, it is not possible for the updater to be locking the
updated version. However, what if there's a third transaction that
locked the updated version? It might be difficult to hit this case or
construct an isolationtester spec file though, since there's a narrow
window you need to race to.

I also pgindented.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
skip-locked-v12-a.patch	text/x-diff	10.5 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-24 21:04:09
Message-ID:	CADLWmXV_bgbAu2LrG_V=ReU9ngA-wRccZUVTxYpG2y8JFnhubw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 22 August 2014 23:02, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> heap_lock_tuple() has the following comment on top:
>
> * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
> * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
> * (the last only for HeapTupleSelfUpdated, since we
> * cannot obtain cmax from a combocid generated by another transaction).
> * See comments for struct HeapUpdateFailureData for additional info.
>
> With the patch as submitted, this struct is not filled in the
> HeapTupleWouldBlock case. I'm not sure this is okay, though I admit the
> only caller that passes LockWaitSkip doesn't care, so maybe we could
> just ignore the issue (after properly modifying the comment). It seems
> easy to add a LockBuffer() and "goto failed" rather than a return; that
> would make that failure case conform to HeapTupleUpdated and
> HeapTupleSelfUpdated. OTOH it might be pointless to worry about what
> would be essentially dead code currently ...

Thanks Alvaro.

Forgive me if I have misunderstood but it looks like your incremental
patch included a couple of unrelated changes, namely
s/0/InvalidCommandId/ and a reversion of ConditionalMultiXactIdWait.
Undoing those gave me skip-locked-v12-b.patch, attached for reference,
and I've included those changes in a new full patch
skip-locked-v13.patch (also rebased).

+1 for the change from if-then-else to switch statements.

I was less sure about the 'goto failed' change, but I couldn't measure
any change in concurrent throughput in my simple benchmark, so I guess
that extra buffer lock/unlock doesn't matter and I can see why you
prefer that control flow.

> Did you consider heap_lock_updated_tuple? A rationale for saying it
> doesn't need to pay attention to the wait policy is: if you're trying to
> lock-skip-locked an updated tuple, then you either skip it because its
> updater is running, or you return it because it's no longer running; and
> if you return it, it is not possible for the updater to be locking the
> updated version. However, what if there's a third transaction that
> locked the updated version? It might be difficult to hit this case or
> construct an isolationtester spec file though, since there's a narrow
> window you need to race to.

Hmm. I will look into this, thanks.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v12-b.patch	text/x-patch	8.6 KB
skip-locked-v13.patch	text/x-patch	56.2 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 01:44:23
Message-ID:	CADLWmXUpOyJuD-Q6=BcGZf8afjbwhixDH72WVXNVv+oFV27GEw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 24 August 2014 22:04, Thomas Munro <munro(at)ip9(dot)org> wrote:
> On 22 August 2014 23:02, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> Did you consider heap_lock_updated_tuple? A rationale for saying it
>> doesn't need to pay attention to the wait policy is: if you're trying to
>> lock-skip-locked an updated tuple, then you either skip it because its
>> updater is running, or you return it because it's no longer running; and
>> if you return it, it is not possible for the updater to be locking the
>> updated version. However, what if there's a third transaction that
>> locked the updated version? It might be difficult to hit this case or
>> construct an isolationtester spec file though, since there's a narrow
>> window you need to race to.
>
> Hmm. I will look into this, thanks.

While trying to produce the heap_lock_updated_tuple_rec case you
describe (so far unsuccessfully), I discovered I could make SELECT ...
FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different
code path after heap_lock_tuple returns: in another session, UPDATE,
COMMIT, then UPDATE, all after the first session has taken its
snapshot but before it tries to lock a given row. The code in
EvalPlanQualFetch (reached from the HeapTupleUpdated case in
ExecLockRow) finishes up waiting for the uncommitted transaction.

I think I see how to teach EvalPlanQualFetch how to handle wait
policies: for NOWAIT it should ereport (fixing a pre-existing bug
(?)), and I guess it should handle SKIP LOCKED by returning NULL,
similarly to the way it handles deleted rows, and of course in all
cases passing the wait policy forward to heap_lock_tuple, which it
eventually calls.

Still looking at heap_lock_updated_tuple.

The difficulty of course will be testing all these racy cases reproducibly...

Best regards,
Thomas Munro

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 01:50:57
Message-ID:	53FA9681.8050003@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08/25/2014 09:44 AM, Thomas Munro wrote:
> On 24 August 2014 22:04, Thomas Munro <munro(at)ip9(dot)org> wrote:
>> On 22 August 2014 23:02, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>>> Did you consider heap_lock_updated_tuple? A rationale for saying it
>>> doesn't need to pay attention to the wait policy is: if you're trying to
>>> lock-skip-locked an updated tuple, then you either skip it because its
>>> updater is running, or you return it because it's no longer running; and
>>> if you return it, it is not possible for the updater to be locking the
>>> updated version. However, what if there's a third transaction that
>>> locked the updated version? It might be difficult to hit this case or
>>> construct an isolationtester spec file though, since there's a narrow
>>> window you need to race to.
>>
>> Hmm. I will look into this, thanks.
>
> While trying to produce the heap_lock_updated_tuple_rec case you
> describe (so far unsuccessfully), I discovered I could make SELECT ...
> FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different
> code path after heap_lock_tuple returns: in another session, UPDATE,
> COMMIT, then UPDATE, all after the first session has taken its
> snapshot but before it tries to lock a given row. The code in
> EvalPlanQualFetch (reached from the HeapTupleUpdated case in
> ExecLockRow) finishes up waiting for the uncommitted transaction.

I think that's the issue Andres and I patched for 9.3, but I don't know
if it got committed.

I'll need to have a look. A search in the archives for heap_lock_tuple
and nowait might be informative.

> The difficulty of course will be testing all these racy cases reproducibly...

Yep, especially as isolationtester can only really work at the statement
level, and can only handle one blocked connection at a time.

It's possible a helper extension could be used - set up some locks in
shmem, register two sessions for different "test roles" within a given
test to install appropriate hooks, wait until they're both blocked on
the locks the hooks acquire, then release the locks.

That might land up with hook points scattered everywhere, but they could
be some pretty minimal macros at least.

IMO that's a separate project, though. For this kind of testing I've
tended to just set conditional breakpoints in gdb, wait until they trap,
then once everything's lined up release the breakpoints in both sessions
as simultaneously as possible. Not perfect, but has tended to work.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 01:57:57
Message-ID:	20140825015757.GU6343@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:

> While trying to produce the heap_lock_updated_tuple_rec case you
> describe (so far unsuccessfully), I discovered I could make SELECT ...
> FOR UPDATE NOWAIT block indefinitely on unpatched 9.3 in a different
> code path after heap_lock_tuple returns: in another session, UPDATE,
> COMMIT, then UPDATE, all after the first session has taken its
> snapshot but before it tries to lock a given row. The code in
> EvalPlanQualFetch (reached from the HeapTupleUpdated case in
> ExecLockRow) finishes up waiting for the uncommitted transaction.

Interesting -- perhaps this helps explain the deadlock issue reported in
http://www.postgresql.org/message-id/CAKrjmhdN+GhAjNwqfHsOtGp+7YN27zR79m99RcAJMNazt5NJrA@mail.gmail.com

> I think I see how to teach EvalPlanQualFetch how to handle wait
> policies: for NOWAIT it should ereport (fixing a pre-existing bug
> (?)), and I guess it should handle SKIP LOCKED by returning NULL,
> similarly to the way it handles deleted rows, and of course in all
> cases passing the wait policy forward to heap_lock_tuple, which it
> eventually calls.
>
> Still looking at heap_lock_updated_tuple.
>
> The difficulty of course will be testing all these racy cases reproducibly...

Does this help?
http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com
The useful trick there is forcing a query to get its snapshot and then
go to sleep before actually doing anything, by way of an advisory lock.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 15:17:19
Message-ID:	20140825151719.GV6343@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:
> On 22 August 2014 23:02, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> Forgive me if I have misunderstood but it looks like your incremental
> patch included a couple of unrelated changes, namely
> s/0/InvalidCommandId/ and a reversion of ConditionalMultiXactIdWait.

Yeah, sorry about those, will push separately.

> Undoing those gave me skip-locked-v12-b.patch, attached for reference,
> and I've included those changes in a new full patch
> skip-locked-v13.patch (also rebased).
>
> +1 for the change from if-then-else to switch statements.
>
> I was less sure about the 'goto failed' change, but I couldn't measure
> any change in concurrent throughput in my simple benchmark, so I guess
> that extra buffer lock/unlock doesn't matter and I can see why you
> prefer that control flow.

I was also thinking in reducing the lock level acquired to shared rather
than exclusive in all the paths that "goto failed". Since the lock is
only used to read a couple of fields from the tuple, shared is enough
and should give slightly better concurrency. Per buffer access rules in
src/backend/storage/buffer/README:

: 1. To scan a page for tuples, one must hold a pin and either shared or
: exclusive content lock. To examine the commit status (XIDs and status bits)
: of a tuple in a shared buffer, one must likewise hold a pin and either shared
: or exclusive lock.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 17:18:05
Message-ID:	53FB6FCD.1030405@vmware.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 07/31/2014 12:29 AM, Thomas Munro wrote:
> On 29 July 2014 02:35, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> David Rowley wrote:
>>
>>> I've also been looking at the isolation tests and I see that you've added a
>>> series of tests for NOWAIT. I was wondering why you did that as that's
>>> really existing code, probably if you thought the tests were a bit thin
>>> around NOWAIT then maybe that should be a separate patch?
>>
>> The isolation tester is new so we don't have nearly enough tests for it.
>> Adding more meaningful tests is good even if they're unrelated to the
>> patch at hand.
>
> Here are my isolation tests for NOWAIT as a separate patch,
> independent of SKIP LOCKED. They cover the tuple lock, regular row
> lock and multixact row lock cases.

Thanks, committed.

> I guess this might be called white
> box testing, since it usese knowledge of how to construct schedules
> that hit the three interesting code paths that trigger the error, even
> though you can't see from the output why the error was raised in each
> case without extra instrumentation (though it did cross my mind that
> it could be interesting at the very least for testing if the error
> message were different in each case).

Yeah, seems reasonable. This kind of tests might become obsolete in the
future if the internals change a lot, so that e.g. we don't use
multixids anymore. But even if that happens, there's little harm in
keeping the tests, and meanwhile it's good to have coverage of these cases.

- Heikki

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-25 21:15:31
Message-ID:	CADLWmXXJtq1OsxmrfdNHA9hTLRP+v5mpq5QjxSXF+BeXX+hOow@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 25 August 2014 02:57, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>> The difficulty of course will be testing all these racy cases reproducibly...
>
> Does this help?
> http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com
> The useful trick there is forcing a query to get its snapshot and then
> go to sleep before actually doing anything, by way of an advisory lock.

Yes it does, thanks Alvaro and Craig. I think the attached spec
reproduces the problem using that trick, ie shows NOWAIT blocking,
presumably in EvalPlanQualFetch (though I haven't stepped through it
with a debugger yet). I'm afraid I'm out of Postgres hacking cycles
for a few days, but next weekend I should have a new patch that fixes
this by teaching EvalPlanQualFetch about wait policies, with isolation
tests for NOWAIT and SKIP LOCKED.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
nowait-4.spec	text/x-rpm-spec	626 bytes

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-27 16:18:46
Message-ID:	20140827161845.GF7046@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:
> On 25 August 2014 02:57, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > Thomas Munro wrote:
> >> The difficulty of course will be testing all these racy cases reproducibly...
> >
> > Does this help?
> > http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com
> > The useful trick there is forcing a query to get its snapshot and then
> > go to sleep before actually doing anything, by way of an advisory lock.
>
> Yes it does, thanks Alvaro and Craig. I think the attached spec
> reproduces the problem using that trick, ie shows NOWAIT blocking,
> presumably in EvalPlanQualFetch (though I haven't stepped through it
> with a debugger yet). I'm afraid I'm out of Postgres hacking cycles
> for a few days, but next weekend I should have a new patch that fixes
> this by teaching EvalPlanQualFetch about wait policies, with isolation
> tests for NOWAIT and SKIP LOCKED.

Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-27 17:00:17
Message-ID:	CADLWmXUB+GFz0wvPDw6aZm5TfcFCHgrhH37CkfbRXwnDbik4xw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 27 August 2014 17:18, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>> On 25 August 2014 02:57, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> > Thomas Munro wrote:
>> >> The difficulty of course will be testing all these racy cases reproducibly...
>> >
>> > Does this help?
>> > http://www.postgresql.org/message-id/51FB4305.3070600@2ndquadrant.com
>> > The useful trick there is forcing a query to get its snapshot and then
>> > go to sleep before actually doing anything, by way of an advisory lock.
>>
>> Yes it does, thanks Alvaro and Craig. I think the attached spec
>> reproduces the problem using that trick, ie shows NOWAIT blocking,
>> presumably in EvalPlanQualFetch (though I haven't stepped through it
>> with a debugger yet). I'm afraid I'm out of Postgres hacking cycles
>> for a few days, but next weekend I should have a new patch that fixes
>> this by teaching EvalPlanQualFetch about wait policies, with isolation
>> tests for NOWAIT and SKIP LOCKED.
>
> Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

Thanks, I hadn't seen this, I should have checked the archives better.
I have actually already updated my patch to handle EvalPlanQualFetch
with NOWAIT and SKIP LOCKED with isolation specs, see attached. I
will compare with Craig's and see if I screwed anything up... of
course I am happy to merge and submit a new patch on top of Craig's if
it's going to be committed.

I haven't yet figured out how to get get into a situation where
heap_lock_updated_tuple_rec waits.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v14.patch	text/x-patch	63.4 KB

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-27 19:52:18
Message-ID:	20140827195218.GJ7046@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:
> On 27 August 2014 17:18, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > Thomas Munro wrote:

> >> Yes it does, thanks Alvaro and Craig. I think the attached spec
> >> reproduces the problem using that trick, ie shows NOWAIT blocking,
> >> presumably in EvalPlanQualFetch (though I haven't stepped through it
> >> with a debugger yet). I'm afraid I'm out of Postgres hacking cycles
> >> for a few days, but next weekend I should have a new patch that fixes
> >> this by teaching EvalPlanQualFetch about wait policies, with isolation
> >> tests for NOWAIT and SKIP LOCKED.
> >
> > Hmm, http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com
>
> Thanks, I hadn't seen this, I should have checked the archives better.
> I have actually already updated my patch to handle EvalPlanQualFetch
> with NOWAIT and SKIP LOCKED with isolation specs, see attached. I
> will compare with Craig's and see if I screwed anything up... of
> course I am happy to merge and submit a new patch on top of Craig's if
> it's going to be committed.

I tried Craig's patch with your test case and found that it stalls in
XactLockTableWait inside EPQFetch because it doesn't throw an error in
the noWait case before waiting. I think I will fix that and push,
including both test cases. Then we can see about rebasing your patch.

I am wondering about backpatching Craig's fix. It looks to me like it
should be backpatched as far back as NOWAIT exists, but that was in 8.1
and we haven't ever gotten a complaint until Craig's report AFAIK, which
I understand wasn't coming from a user finding a problem but rather some
new development. So I hesitate.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>, Thomas Munro <munro(at)ip9(dot)org>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: FOR [SHARE\|UPDATE] NOWAIT may still block in EvalPlanQualFetch
Date:	2014-08-27 23:22:24
Message-ID:	20140827232224.GL7046@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Craig Ringer wrote:
> FOR SHARE|UPDATE NOWAIT will still block if they have to follow a ctid
> chain because the call to EvalPlanQualFetch doesn't take a param for
> noWait, so it doesn't know not to block if the updated row can't be locked.

Applied with some further editorialization.

In another thread, I wrote in response to Thomas Munro:

> I tried Craig's patch with your test case and found that it stalls in
> XactLockTableWait inside EPQFetch because it doesn't throw an error in
> the noWait case before waiting. I think I will fix that and push,
> including both test cases.

Done. I also made the noWait case try a ConditionalXactLockTableWait
before raising an error, because there's a small chance that the
updating transaction aborts just before we check; in this case there is
no need to raise an error.

> I am wondering about backpatching Craig's fix. It looks to me like it
> should be backpatched as far back as NOWAIT exists, but that was in 8.1
> and we haven't ever gotten a complaint until Craig's report AFAIK, which
> I understand wasn't coming from a user finding a problem but rather some
> new development. So I hesitate.

On further reflection, the reason users don't complain is that it's
quite difficult to notice that there is an issue. And on the other
hand, since they are already specifying NOWAIT in the query, surely they
must already be expecting an error to be raised; it's not like this
introduces a new failure mode. My inclination would be to backpatch the
fix all the way back.

However, due to time constraints (because it fails to apply cleanly
to older branches) and due to lack of user complaints, I only
backpatched to 9.4. We can always revisit that, if there's demand.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-27 23:25:18
Message-ID:	20140827232518.GM7046@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:

> Thanks, I hadn't seen this, I should have checked the archives better.
> I have actually already updated my patch to handle EvalPlanQualFetch
> with NOWAIT and SKIP LOCKED with isolation specs, see attached. I
> will compare with Craig's and see if I screwed anything up... of
> course I am happy to merge and submit a new patch on top of Craig's if
> it's going to be committed.

Thanks, please rebase.

> I haven't yet figured out how to get get into a situation where
> heap_lock_updated_tuple_rec waits.

Well, as I think I said in the first post I mentioned this, maybe there
is no such situation. In any case, like the EvalPlanQualFetch issue, we
can fix it later if we find it.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-28 07:31:37
Message-ID:	CADLWmXUVkvDqodVCiRGdWW_BAHjwZZWEAi22GUFhx+KhY6JkTQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 28 August 2014 00:25, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>
>> Thanks, I hadn't seen this, I should have checked the archives better.
>> I have actually already updated my patch to handle EvalPlanQualFetch
>> with NOWAIT and SKIP LOCKED with isolation specs, see attached. I
>> will compare with Craig's and see if I screwed anything up... of
>> course I am happy to merge and submit a new patch on top of Craig's if
>> it's going to be committed.
>
> Thanks, please rebase.

Thanks -- new patch attached.

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v15.patch	text/x-patch	62.9 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-08-31 00:36:28
Message-ID:	CADLWmXU9LXhxowjjzN5ZTQtiNxv7dWT3VjNzqiKYxJA5BEN72w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 28 August 2014 00:25, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>> I haven't yet figured out how to get get into a situation where
>> heap_lock_updated_tuple_rec waits.
>
> Well, as I think I said in the first post I mentioned this, maybe there
> is no such situation. In any case, like the EvalPlanQualFetch issue, we
> can fix it later if we find it.

I finally came up with a NOWAIT spec that reaches
heap_lock_updated_rec and then blocks. I can't explain why exactly...
but please see attached. The fix seems fairly straightforward. Do
you think I should submit an independent patch to fix this case (well
there are really two cases, since there is a separate multixact path)
for the existing NOWAIT support and then tackle the SKIP LOCKED
equivalent separately?

Best regards,
Thomas Munro

Attachment	Content-Type	Size
nowait-6.spec	text/x-rpm-spec	635 bytes

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-06 10:40:00
Message-ID:	CADLWmXWxCBV4dnW6eOyiSt=7hRvnxpgfDLy+TFRLPuOEWxKpeA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 31 August 2014 01:36, Thomas Munro <munro(at)ip9(dot)org> wrote:
> On 28 August 2014 00:25, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>> Thomas Munro wrote:
>>> I haven't yet figured out how to get get into a situation where
>>> heap_lock_updated_tuple_rec waits.
>>
>> Well, as I think I said in the first post I mentioned this, maybe there
>> is no such situation. In any case, like the EvalPlanQualFetch issue, we
>> can fix it later if we find it.
>
> I finally came up with a NOWAIT spec that reaches
> heap_lock_updated_rec and then blocks. I can't explain why exactly...
> but please see attached. The fix seems fairly straightforward. Do
> you think I should submit an independent patch to fix this case (well
> there are really two cases, since there is a separate multixact path)
> for the existing NOWAIT support and then tackle the SKIP LOCKED
> equivalent separately?

Oops, that isolation wasn't right, please disregard. Unless you think
this obscure case needs to be addressed before the SKIP LOCKED patch
can be committed, I will investigate it separately and maybe submit
something to a future commitfest.

Best regards,
Thomas Munro

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-10 13:47:28
Message-ID:	20140910134728.GB4701@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:

> @@ -2022,7 +2030,7 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode, bool noWait,
> */
> test = heap_lock_tuple(relation, &tuple,
> estate->es_output_cid,
> - lockmode, noWait,
> + lockmode, wait_policy,
> false, &buffer, &hufd);
> /* We now have two pins on the buffer, get rid of one */
> ReleaseBuffer(buffer);

Doesn't this heap_lock_tuple() need to check for a WouldBlock result?
Not sure that this is the same case that you were trying to test in
heap_lock_updated_tuple; I think this requires an update chain (so that
EPQFetch is invoked) and some tuple later in the chain is locked by a
third transaction.

I attach some additional minor suggestions to your patch. Please feel
free to reword comments differently if you think my wording isn't an
improvements (or I've maked an english mistakes).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-10 19:37:22
Message-ID:	CA+TgmoZoWLAcFDQDu758nfSbtFPTP3hg7Mb0gjKzpq-jDU8GFA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Sep 10, 2014 at 9:47 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> I attach some additional minor suggestions to your patch.

I don't see any attachment.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Thomas Munro <munro(at)ip9(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-10 20:35:55
Message-ID:	20140910203555.GE4701@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas wrote:
> On Wed, Sep 10, 2014 at 9:47 AM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > I attach some additional minor suggestions to your patch.
>
> I don't see any attachment.

Uh. Here it is.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment	Content-Type	Size
skip-locked-v15-a.patch	text/x-diff	5.6 KB

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-11 23:10:37
Message-ID:	CADLWmXWVdFCOnEJgRUpWNXLTW=nbSf-8KnZ5MndvjRzYdp_itw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10 September 2014 14:47, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>
>> @@ -2022,7 +2030,7 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode, bool noWait,
>> */
>> test = heap_lock_tuple(relation, &tuple,
>> estate->es_output_cid,
>> - lockmode, noWait,
>> + lockmode, wait_policy,
>> false, &buffer, &hufd);
>> /* We now have two pins on the buffer, get rid of one */
>> ReleaseBuffer(buffer);
>
> Doesn't this heap_lock_tuple() need to check for a WouldBlock result?
> Not sure that this is the same case that you were trying to test in
> heap_lock_updated_tuple; I think this requires an update chain (so that
> EPQFetch is invoked) and some tuple later in the chain is locked by a
> third transaction.

You're right, that case was clearly lacking. The new patch, attached,
releases the buffer and returns NULL in that case. But I have no idea
how to reach it in an isolation test! skip-locked-4.spec gets pretty
close, it creates the right kind of update chain to get into
EvalPlanQualFetch and then enter the conditional block beginning:

if (TransactionIdIsValid(SnapshotDirty.xmax))

But to reach the case you mentioned, it would need to get past that
(xmax is not a valid transaction) but then the tuple would need to be
locked by another session before heap_lock_tuple is called a few lines
below. That's a race scenario that I don't believe we can create
using advisory lock tricks in an isolation test.

> I attach some additional minor suggestions to your patch. Please feel
> free to reword comments differently if you think my wording isn't an
> improvements (or I've maked an english mistakes).

Thanks, these are incorporated in the new version (also rebased).

Best regards,
Thomas Munro

Attachment	Content-Type	Size
skip-locked-v16.patch	text/x-patch	63.6 KB

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-12 02:56:21
Message-ID:	20140912025621.GI4701@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:

> > Doesn't this heap_lock_tuple() need to check for a WouldBlock result?
> > Not sure that this is the same case that you were trying to test in
> > heap_lock_updated_tuple; I think this requires an update chain (so that
> > EPQFetch is invoked) and some tuple later in the chain is locked by a
> > third transaction.
>
> You're right, that case was clearly lacking. The new patch, attached,
> releases the buffer and returns NULL in that case. But I have no idea
> how to reach it in an isolation test! skip-locked-4.spec gets pretty
> close, it creates the right kind of update chain to get into
> EvalPlanQualFetch and then enter the conditional block beginning:
>
> if (TransactionIdIsValid(SnapshotDirty.xmax))
>
> But to reach the case you mentioned, it would need to get past that
> (xmax is not a valid transaction) but then the tuple would need to be
> locked by another session before heap_lock_tuple is called a few lines
> below. That's a race scenario that I don't believe we can create
> using advisory lock tricks in an isolation test.

Hm, are you able to reproduce it using GDB?

Craig Ringer was saying elsewhere that there are other cases that are
impossible to test reliably and was proposing addings hooks or
something to block backends at convenient times. Not an easy problem ...

> > I attach some additional minor suggestions to your patch. Please feel
> > free to reword comments differently if you think my wording isn't an
> > improvements (or I've maked an english mistakes).
>
> Thanks, these are incorporated in the new version (also rebased).

Great, thanks; I'll look at it again soon to commit, as I think we're
done now.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Thomas Munro <munro(at)ip9(dot)org>
To:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-09-14 22:30:27
Message-ID:	CADLWmXXss83oiYD0pn_SfQfg+yNEpPbPvgDb8w6Fh--jScSybA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 12 September 2014 03:56, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>> But to reach the case you mentioned, it would need to get past that
>> (xmax is not a valid transaction) but then the tuple would need to be
>> locked by another session before heap_lock_tuple is called a few lines
>> below. That's a race scenario that I don't believe we can create
>> using advisory lock tricks in an isolation test.
>
> Hm, are you able to reproduce it using GDB?
>
> Craig Ringer was saying elsewhere that there are other cases that are
> impossible to test reliably and was proposing addings hooks or
> something to block backends at convenient times. Not an easy problem ...

+1, I think that is a great idea.

FWIW here's some throwaway code that I used to do that:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 79667f1..fbb3b55 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -54,6 +54,7 @@
#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
+#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/snapmgr.h"
@@ -2029,6 +2030,20 @@ EvalPlanQualFetch(EState *estate, Relation
relation, int lockmode,
}

/*
+ * Begin wait point debugging hack...
+ * TODO: Only in a special build mode...
+ * We tell anyone waiting that we have reached
wait point #42.
+ * We wait for permission to proceed from wait
point #43.
+ */
+ elog(WARNING, "XXX reached point 42, waiting
at point 43");
+ DirectFunctionCall1(pg_advisory_unlock_int8,
Int64GetDatum(42));
+ DirectFunctionCall1(pg_advisory_lock_int8,
Int64GetDatum(43));
+ elog(WARNING, "XXX continuing after point 43");
+ /*
+ * End wait point debugging hack.
+ */
+
+ /*
* This is a live tuple, so now try to lock it.
*/
test = heap_lock_tuple(relation, &tuple,

Using the attached isolation spec, that race case is reached. Yeah,
it's crude and confusing having those three advisory locks (one to
allow an update chain to be created after s1 takes a snapshot, and the
other two so that s2 can block s1 at the right point to produce that
race case), but I found this less messy than trying to reproduce
complicated concurrency scenarios with GDB.

IMHO it would be great if there were a tidy and supported way to do
this kind of thing, perhaps with a formal notion of named wait points
which are only compiled in in special test builds, and an optional set
of extra isolation specs that use them.

>> > I attach some additional minor suggestions to your patch. Please feel
>> > free to reword comments differently if you think my wording isn't an
>> > improvements (or I've maked an english mistakes).
>>
>> Thanks, these are incorporated in the new version (also rebased).
>
> Great, thanks; I'll look at it again soon to commit, as I think we're
> done now.

Thanks!

Thomas Munro

Attachment	Content-Type	Size
test.spec	text/x-rpm-spec	1.3 KB

From:	Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To:	Thomas Munro <munro(at)ip9(dot)org>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SKIP LOCKED DATA (work in progress)
Date:	2014-10-07 20:29:45
Message-ID:	20141007202944.GQ7043@eldon.alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thomas Munro wrote:

Pushed, thanks.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services