Re: foreign key locks

Lists: pgsql-hackers
From: "Kevin Grittner" <kgrittn(at)mail(dot)com>
To: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org,"Andres Freund" <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2012-11-29 15:23:26
Message-ID: 20121129152326.69330@gmx.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> Alvaro Herrera wrote:
>
>> Here's version 24.
>
> This no longer applies after the rmgr rm_desc patch.

I took a shot at merging those so I could review the patch against
HEAD; attached in hopes that it will be useful.

Hopefully this isn't too far off from what .

-Kevin

Attachment Content-Type Size
fklocks-25.patch.gz application/x-gzip 97.1 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2012-11-29 16:02:20
Message-ID: 20121129160220.GA4942@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> Kevin Grittner wrote:
> > Alvaro Herrera wrote:
> >
> >> Here's version 24.
> >
> > This no longer applies after the rmgr rm_desc patch.
>
> I took a shot at merging those so I could review the patch against
> HEAD; attached in hopes that it will be useful.
>
> Hopefully this isn't too far off from what .

Uh, sorry for remaining silent -- I'm about to post an updated patch,
having just pushed my merge to github.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2012-11-29 16:16:54
Message-ID: 20121129161654.GB4942@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Kevin Grittner wrote:
> > Kevin Grittner wrote:
> > > Alvaro Herrera wrote:
> > >
> > >> Here's version 24.
> > >
> > > This no longer applies after the rmgr rm_desc patch.
> >
> > I took a shot at merging those so I could review the patch against
> > HEAD; attached in hopes that it will be useful.
> >
> > Hopefully this isn't too far off from what .
>
> Uh, sorry for remaining silent -- I'm about to post an updated patch,
> having just pushed my merge to github.

Here it is.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fklocks-26.patch.gz application/octet-stream 96.8 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: "Kevin Grittner" <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Andres Freund" <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2012-12-22 09:53:47
Message-ID: b1a78c715f9195e6f96546c8e723afd8.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> Here it is.
>
> fklocks-26.patch.gz

This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
chunk. (It's not really a problem.)

I also wondered if anyone has any perl/python/bash programs handy that uses these new options. I
can hack something together myself, but I just thought I'd ask.

Thanks,

Erik Rijkers


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2012-12-22 11:50:12
Message-ID: 20121222115012.GA4202@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-12-22 10:53:47 +0100, Erik Rijkers wrote:
> On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> > Here it is.
> >
> > fklocks-26.patch.gz
>
> This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
> chunk. (It's not really a problem.)
>
> I also wondered if anyone has any perl/python/bash programs handy that uses these new options. I
> can hack something together myself, but I just thought I'd ask.

I have mostly done code review and some very targeted testing (pgbench,
gdb + psql). So no ready scripts, sorry.
There are imo some unfinished pieces (the whole EPQ integration) that
will require significant retesting once Alvaro has time to work on this
again..

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2012-12-28 20:48:47
Message-ID: 20121228204847.GD4150@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Erik Rijkers wrote:
> On Thu, November 29, 2012 17:16, Alvaro Herrera wrote:
> > Here it is.
> >
> > fklocks-26.patch.gz
>
> This applies today after removing, not only the infamous catversion.h chunk, but also a file_fdw
> chunk. (It's not really a problem.)

FWIW, while it's probably not interesting to you, it must be noted that
the catversion.h number to use must match a
contrib/pg_upgrade/pg_upgrade.h symbol.

As far as file_fdw hunks, I only see this one:

--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -191,7 +191,7 @@ ERROR: cannot change foreign table "agg_csv"
DELETE FROM agg_csv WHERE a = 100;
ERROR: cannot change foreign table "agg_csv"
SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
-ERROR: SELECT FOR UPDATE/SHARE cannot be used with foreign table "agg_csv"
+ERROR: SELECT FOR UPDATE/SHARE/KEY UPDATE/KEY SHARE cannot be used with foreign table "agg_csv"
LINE 1: SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
^
-- but this should be ignored

Surely that needs to be patched still? And this hunk applies without
any trouble, so I don't see why you had to remove it.

Anyway, here's v27, which is the same code merged to latest master.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fklocks-27.patch.gz application/octet-stream 96.9 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2013-01-10 21:00:40
Message-ID: 20130110210040.GB4110@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here's version 28 of this patch. The only substantive change here from
v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
or LockTupleNoKeyExclusive, depending on whether the key columns are
being modified by the update. This needs to go through EvalPlanQual, so
that function is now getting the lock mode as a parameter instead of
hardcoded LockTupleExclusive. (All other users of GetTupleForTrigger
still use LockTupleExclusive, so there's no change for anybody other
than FOR EACH ROW BEFORE UPDATE triggers).

At this point, I think I've done all I wanted to do in connection with
this patch, and I intend to commit. I think this is a good time to get
it committed, so that people can play with it more extensively and
report any breakage I might have caused before we even get into beta.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fklocks-28.patch.gz application/octet-stream 99.1 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-10 21:22:18
Message-ID: 20130110212218.GE4329@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> Here's version 28 of this patch. The only substantive change here from
> v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> or LockTupleNoKeyExclusive, depending on whether the key columns are
> being modified by the update. This needs to go through EvalPlanQual, so
> that function is now getting the lock mode as a parameter instead of
> hardcoded LockTupleExclusive. (All other users of GetTupleForTrigger
> still use LockTupleExclusive, so there's no change for anybody other
> than FOR EACH ROW BEFORE UPDATE triggers).

Is that enough in case of a originally non-key update in read committed
mode that turns into a key update due to a concurrent key update?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-11 15:11:47
Message-ID: 20130111151147.GB4208@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund wrote:
> On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> > Here's version 28 of this patch. The only substantive change here from
> > v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> > or LockTupleNoKeyExclusive, depending on whether the key columns are
> > being modified by the update. This needs to go through EvalPlanQual, so
> > that function is now getting the lock mode as a parameter instead of
> > hardcoded LockTupleExclusive. (All other users of GetTupleForTrigger
> > still use LockTupleExclusive, so there's no change for anybody other
> > than FOR EACH ROW BEFORE UPDATE triggers).
>
> Is that enough in case of a originally non-key update in read committed
> mode that turns into a key update due to a concurrent key update?

Hm, let me try to work through your example. You say that a transaction
T1 does a non-key update, and is working through the BEFORE UPDATE
trigger; then transaction T2 does a key update and changes the key
underneath T1? So at that point T1 becomes a key update, because it's
now using the original key values which are no longer the key?

I don't think this can really happen, because T2 (which is requesting
TupleLockExclusive) would block on the lock that the trigger is grabbing
(TupleLockNoKeyExclusive) on the tuple. So T2 would sleep until T1 is
committed.

Now, maybe you meant that the BEFORE UPDATE trigger changes the key
value but the user-supplied UPDATE query does not. So the trigger turns
the no-key update into a key update. What would happen here is that
GetTupleForTrigger would acquire TupleLockNoKeyExclusive on the tuple,
and later heap_update would acquire TupleLockExclusive. So there is
lock escalation happening. This could cause a deadlock against someone
else grabbing a TupleLockKeyShare on the tuple. I think the answer is
"don't do that", i.e. don't update the key columns in a BEFORE UPDATE
trigger.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-11 15:19:30
Message-ID: 20130111151930.GB6049@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-11 12:11:47 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2013-01-10 18:00:40 -0300, Alvaro Herrera wrote:
> > > Here's version 28 of this patch. The only substantive change here from
> > > v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> > > or LockTupleNoKeyExclusive, depending on whether the key columns are
> > > being modified by the update. This needs to go through EvalPlanQual, so
> > > that function is now getting the lock mode as a parameter instead of
> > > hardcoded LockTupleExclusive. (All other users of GetTupleForTrigger
> > > still use LockTupleExclusive, so there's no change for anybody other
> > > than FOR EACH ROW BEFORE UPDATE triggers).
> >
> > Is that enough in case of a originally non-key update in read committed
> > mode that turns into a key update due to a concurrent key update?
>
> Hm, let me try to work through your example. You say that a transaction
> T1 does a non-key update, and is working through the BEFORE UPDATE
> trigger; then transaction T2 does a key update and changes the key
> underneath T1? So at that point T1 becomes a key update, because it's
> now using the original key values which are no longer the key?
>
> I don't think this can really happen, because T2 (which is requesting
> TupleLockExclusive) would block on the lock that the trigger is grabbing
> (TupleLockNoKeyExclusive) on the tuple. So T2 would sleep until T1 is
> committed.

No, I was thinking about an update without triggers present.

T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
T1: BEGIN; -- read committed
T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
T2: BEGIN; -- read committed
T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
T1: COMMIT;
T2: /* UPDATE follows to updated row, due to the changed name its a key update now */

Does that make sense?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-11 16:10:49
Message-ID: 20130111161049.GD4208@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund wrote:

> No, I was thinking about an update without triggers present.
>
> T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
> T1: BEGIN; -- read committed
> T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
> T2: BEGIN; -- read committed
> T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
> T1: COMMIT;
> T2: /* UPDATE follows to updated row, due to the changed name its a key update now */
>
> Does that make sense?

So I guess your question is "is T2 now holding a TupleLockExclusive
lock?" To answer it, I turned your example into a isolationtester spec:

setup
{
CREATE TABLE tbl(id serial primary key, name text unique, data text);
INSERT INTO tbl VALUES (1, 'blarg', 'no data');
}

teardown
{
DROP TABLE tbl;
}

session "s1"
step "s1b" { BEGIN; }
step "s1u" { UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; }
step "s1c" { COMMIT; }

session "s2"
step "s2b" { BEGIN; }
step "s2u" { UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; }
step "s2c" { COMMIT; }

session "s3"
step "s3l" { SELECT * FROM tbl FOR KEY SHARE; }

permutation "s1b" "s1u" "s2b" "s2u" "s1c" "s3l" "s2c"

And the results:
Parsed test spec with 3 sessions

starting permutation: s1b s1u s2b s2u s1c s3l s2c
step s1b: BEGIN;
step s1u: UPDATE tbl SET name = 'foo' WHERE name = 'blarg';
step s2b: BEGIN;
step s2u: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; <waiting ...>
step s1c: COMMIT;
step s2u: <... completed>
step s3l: SELECT * FROM tbl FOR KEY SHARE; <waiting ...>
step s2c: COMMIT;
step s3l: <... completed>
id name data

1 blarg blarg

So session 3 is correctly waiting for session 2 to finish before being
ablt to grab its FOR KEY SHARE lock, indicating that session 2 is
holding a FOR UPDATE lock. Good.

If I change session 1 to update the data column instead of name, session
3 no longer needs to wait for session 2, meaning session 2 now only
grabs a FOR NO KEY UPDATE lock. Also good.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-11 16:31:13
Message-ID: 20130111163113.GD6049@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-11 13:10:49 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>
> > No, I was thinking about an update without triggers present.
> >
> > T0: CREATE TABLE tbl(id serial pk, name text unique, data text);
> > T1: BEGIN; -- read committed
> > T1: UPDATE tbl SET name = 'foo' WHERE name = 'blarg'; /* key update of row id = 1 */
> > T2: BEGIN; -- read committed
> > T2: UPDATE tbl SET name = 'blarg', data = 'blarg' WHERE id = 1; /* no key update, waiting */
> > T1: COMMIT;
> > T2: /* UPDATE follows to updated row, due to the changed name its a key update now */
> >
> > Does that make sense?
>
> So I guess your question is "is T2 now holding a TupleLockExclusive
> lock?" To answer it, I turned your example into a isolationtester spec:

Great! I reread the code and it does make sense the way its implemented
now. I misremembered something...

I vote for adding that spectest including some appropriate permutations.

FWIW: Looks good to me. It could use another pair of eyes, but I guess
that will have to come by being used.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2013-01-18 20:02:04
Message-ID: 20130118200204.GH4063@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Here's version 28 of this patch. The only substantive change here from
> v26 is that I've made GetTupleForTrigger() use either LockTupleExclusive
> or LockTupleNoKeyExclusive, depending on whether the key columns are
> being modified by the update. This needs to go through EvalPlanQual, so
> that function is now getting the lock mode as a parameter instead of
> hardcoded LockTupleExclusive. (All other users of GetTupleForTrigger
> still use LockTupleExclusive, so there's no change for anybody other
> than FOR EACH ROW BEFORE UPDATE triggers).
>
> At this point, I think I've done all I wanted to do in connection with
> this patch, and I intend to commit. I think this is a good time to get
> it committed, so that people can play with it more extensively and
> report any breakage I might have caused before we even get into beta.

While trying this out this morning I noticed a bug in the XLog code:
trying to lock the updated version of a tuple when the page that
contains the updated version requires a backup block, would cause this:

PANIC: invalid xlog record length 0

The reason is that there is an (unknown to me) rule that there must be
some data not associated with a buffer:

/*
* NOTE: We disallow len == 0 because it provides a useful bit of extra
* error checking in ReadRecord. This means that all callers of
* XLogInsert must supply at least some not-in-a-buffer data. [...]
*/

This seems pretty strange to me. And having the rule be spelled out
only in a comment within XLogInsert and not at its top, and not nearby
the XLogRecData struct definition either, seems pretty strange to me.
I wonder how come every PG hacker except me knows this.

For the curious, I attach an isolationtester spec file I used to
reproduce the problem (and verify the fix).

To fix the crash I needed to do this:

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 99fced1..9762890 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4838,7 +4838,7 @@ l4:
{
xl_heap_lock_updated xlrec;
XLogRecPtr recptr;
- XLogRecData rdata;
+ XLogRecData rdata[2];
Page page = BufferGetPage(buf);

xlrec.target.node = rel->rd_node;
@@ -4846,13 +4846,18 @@ l4:
xlrec.xmax = new_xmax;
xlrec.infobits_set = compute_infobits(new_infomask, new_infomask2);

- rdata.data = (char *) &xlrec;
- rdata.len = SizeOfHeapLockUpdated;
- rdata.buffer = buf;
- rdata.buffer_std = true;
- rdata.next = NULL;
+ rdata[0].data = (char *) &xlrec;
+ rdata[0].len = SizeOfHeapLockUpdated;
+ rdata[0].buffer = InvalidBuffer;
+ rdata[0].next = &(rdata[1]);
+
+ rdata[1].data = NULL;
+ rdata[1].len = 0;
+ rdata[1].buffer = buf;
+ rdata[1].buffer_std = true;
+ rdata[1].next = NULL;

- recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, &rdata);
+ recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, rdata);

PageSetLSN(page, recptr);
PageSetTLI(page, ThisTimeLineID);

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
xlog-lock-updated.spec text/plain 622 bytes

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2013-01-18 20:24:50
Message-ID: CA+U5nMK=++qmoLik=xvsZdKyg5ojTEk2bK0rX6DRX01MSP38nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18 January 2013 20:02, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> XLOG_HEAP2_LOCK_UPDATED

Every xlog record needs to know where to put the block.

Compare with XLOG_HEAP_LOCK

xlrec.target.node = relation->rd_node;

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: foreign key locks
Date: 2013-01-18 20:37:47
Message-ID: 28623.1358541467@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> The reason is that there is an (unknown to me) rule that there must be
> some data not associated with a buffer:

> /*
> * NOTE: We disallow len == 0 because it provides a useful bit of extra
> * error checking in ReadRecord. This means that all callers of
> * XLogInsert must supply at least some not-in-a-buffer data. [...]
> */

> This seems pretty strange to me. And having the rule be spelled out
> only in a comment within XLogInsert and not at its top, and not nearby
> the XLogRecData struct definition either, seems pretty strange to me.
> I wonder how come every PG hacker except me knows this.

I doubt it ever came up before. What use is logging only the content of
a buffer page? Surely you'd need to know, for example, which relation
and page number it is from.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-18 20:46:43
Message-ID: 20130118204643.GA4359@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > The reason is that there is an (unknown to me) rule that there must be
> > some data not associated with a buffer:
>
> > /*
> > * NOTE: We disallow len == 0 because it provides a useful bit of extra
> > * error checking in ReadRecord. This means that all callers of
> > * XLogInsert must supply at least some not-in-a-buffer data. [...]
> > */
>
> > This seems pretty strange to me. And having the rule be spelled out
> > only in a comment within XLogInsert and not at its top, and not nearby
> > the XLogRecData struct definition either, seems pretty strange to me.
> > I wonder how come every PG hacker except me knows this.
>
> I doubt it ever came up before. What use is logging only the content of
> a buffer page? Surely you'd need to know, for example, which relation
> and page number it is from.

It only got to be a length of 0 because the the data got removed due to
a logged full page write. And the backup block contains the data about
which blocks it is logging in itself.
I wonder if the check shouldn't just check write_len instead, directly
below the loop that ads backup blocks.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-18 21:01:18
Message-ID: 693.1358542878@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
>> I doubt it ever came up before. What use is logging only the content of
>> a buffer page? Surely you'd need to know, for example, which relation
>> and page number it is from.

> It only got to be a length of 0 because the the data got removed due to
> a logged full page write. And the backup block contains the data about
> which blocks it is logging in itself.

And if the full-page-image case *hadn't* been invoked, what then? I
still don't see a very good argument for xlog records with no fixed
data.

> I wonder if the check shouldn't just check write_len instead, directly
> below the loop that ads backup blocks.

We're not changing this unless you can convince me that the read-side
error check mentioned in the comment is useless.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-18 21:13:18
Message-ID: 20130118211317.GB4359@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-01-18 16:01:18 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
> >> I doubt it ever came up before. What use is logging only the content of
> >> a buffer page? Surely you'd need to know, for example, which relation
> >> and page number it is from.
>
> > It only got to be a length of 0 because the the data got removed due to
> > a logged full page write. And the backup block contains the data about
> > which blocks it is logging in itself.
>
> And if the full-page-image case *hadn't* been invoked, what then? I
> still don't see a very good argument for xlog records with no fixed
> data.

In that case data would have been logged?

The code in question was:

xl_heap_lock_updated xlrec;
xlrec.target.node = rel->rd_node;
...
xlrec.xmax = new_xmax;

- rdata.data = (char *) &xlrec;
- rdata.len = SizeOfHeapLockUpdated;
- rdata.buffer = buf;
- rdata.buffer_std = true;
- rdata.next = NULL;
- recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED, &rdata);

Other wal logging code (and fklocks now as well) just put those into
two XLogRecData blocks to avoid the issue.

> > I wonder if the check shouldn't just check write_len instead, directly
> > below the loop that ads backup blocks.
>
> We're not changing this unless you can convince me that the read-side
> error check mentioned in the comment is useless.

Youre right, the read side check is worth quite a bit. I think I am
retracting my suggestion.
I guess the amount of extra data thats uselessly logged although never
used in in the redo functions doesn't really amass to anything
significant in comparison to the backup block data.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Kevin Grittner <kgrittn(at)mail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-19 10:21:27
Message-ID: CA+U5nM+6OO9QeU_3gM6StFcw4E_emjv5yC-6n6HWfvHKM9UVHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18 January 2013 21:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>> On 2013-01-18 15:37:47 -0500, Tom Lane wrote:
>>> I doubt it ever came up before. What use is logging only the content of
>>> a buffer page? Surely you'd need to know, for example, which relation
>>> and page number it is from.
>
>> It only got to be a length of 0 because the the data got removed due to
>> a logged full page write. And the backup block contains the data about
>> which blocks it is logging in itself.
>
> And if the full-page-image case *hadn't* been invoked, what then? I
> still don't see a very good argument for xlog records with no fixed
> data.
>
>> I wonder if the check shouldn't just check write_len instead, directly
>> below the loop that ads backup blocks.
>
> We're not changing this unless you can convince me that the read-side
> error check mentioned in the comment is useless.

There's some confusion here, I think. Alvaro wasn't proposing a WAL
record that had no fixed data.

The current way XLogRecData works is that if you put data and buffer
together on the same chunk then we optimise the data away if we take a
backup block.

Alvaro chose what looks like the right way to do this when you have
simple data: use one XLogRecData chunk and let the data part get
optimised away. But doing that results in a WAL record that has a
backup block, plus data of 0 length, which then fails the later check.

All current code gets around this by including multiple XLogRecData
chunks, which results in including additional data that is superfluous
when the backup block is present. Alvaro was questioning this; I
didn't understand that either when he said it, but I do now.

The zero length check should stay, definitely.

What this looks like is that further compression of the WAL is
possible, but given its alongside backup blocks, that's on the order
of a 1% saving, so probably isn't worth considering right now. The way
to do that would to include a small token to show record has been
optimised, rather than being zero length.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: foreign key locks
Date: 2013-01-23 18:33:12
Message-ID: 20130123183312.GG4249@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I just pushed this patch to the master branch. There was a
corresponding catversion bump and pg_control version bump. I have
verified that "make check-world" passes on my machine, as well as
isolation tests and pg_upgrade.

Tom Lane said at one point "this is too complex to maintain". Several
times during the development I feared he might well be right. I am sure
he will be discovering many oversights and poor design choices, when
gets around to reviewing the code; hopefully some extra effort will be
all that's needed to make the whole thing work sanely and not eat
anyone's data. I just hope that nothing so serious comes up that the
patch will need to be reverted completely.

This patch is the result of the work of many people. I am not allowed
to mention the patch sponsors in the commit message, so I'm doing it
here: first and foremost I need to thank my previous and current
employers Command Prompt and 2ndQuadrant -- they were extremely kind in
allowing me to work on this for days on end (and not all of it was
supported by financial sponsors). Joel Jacobson started the whole
effort by posting a screencast of a problem his company was having; I
hope they found a workaround in the meantime, because his post was in
mid 2010. The key idea of this patch' design came from Simon Riggs;
Noah Misch provided additional design advice that allowed the project
torun to completion. Noah and Andres Freund spent considerable time
reviewing early versions of this patch; they uncovered many embarrasing
bugs in my implementation. Kevin Grittner, Robert Haas, and others,
provided useful comments as well. Noah Misch, Andres Freund, Marti
Raudsepp and Alexander Shulgin also provided bits of code.

Any bugs that remain are, of course, my own fault only.

Financial support came from

* Command Prompt Inc. of Washington, US
* 2ndQuadrant Ltd. of United Kingdom
* Trustly (then Glue Finance) of Sweden
* Novozymes A/S of Denmark
* MailerMailer LLC of Maryland, US
* PostgreSQL Experts, Inc. of California, US.

My sincerest thanks to everyone.

I seriously hope that no patch of mine ever becomes this monstruous
again.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services