Re: fixing bug in combocid.c

Lists: pgsql-patches
From: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
To: pgsql-patches(at)postgresql(dot)org
Subject: fixing bug in combocid.c
Date: 2008-08-29 22:00:04
Message-ID: 48B87164.4050802@soe.ucsc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

This patch is for a bug in backend/utils/time/combocid.c.

In HeapTupleHeaderAdjustCmax, the code makes the incorrect assumption
that the raw command id corresponds to cmin, when the raw command id can
in fact be a combo cid. It needs to be fixed to call
HeapTupleHeaderGetCmin to get cmin. That's all there is to my patch, but
read on for more details.

One thing that can happen is that multiple subtransactions may delete a
tuple and each abort. Each subtransaction (except the first one) will
store a combo cid that uses another combo cid as cmin. Then when cmin is
accessed by a later operation, HeapTupleHeaderGetCmin will return the
previous combo cid. In unlucky situations, the bogus cmin will be so
high that it looks like the tuple was inserted in the future.

I have pasted an example below to show how things can go wrong. The
second SELECT leaves out some tuples because their stored cmin is
actually a large combo cid. The third SELECT gives correct results
because the inserting transaction committed, so the command id is
ignored. The script file with these commands in in the attachment, and I
think it could make for a valuable regression test.

Side note: Another idea for a fix would be to have tqual.c change a
combocid to a simple cmin when it does a time qual check and sees that
the deleting subtransaction aborted. This is no good because it requires
a write lock to change two header fields atomically, namely the
HEAP_COMBOCID hint and t_cid. There are probably other issues, but this
seems like a show-stopper.

The one-line patch is in the attachment. It gives correct behavior on my
test case. This is my first patch; I hope I did it right!

Thanks!
Karl Schnaitter

test=# CREATE TABLE tab(a INT);
CREATE TABLE
test=# BEGIN;
BEGIN
test=# INSERT INTO tab VALUES(1);
INSERT 0 1
test=# INSERT INTO tab VALUES(2);
INSERT 0 1
test=# INSERT INTO tab VALUES(3);
INSERT 0 1
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SELECT a FROM tab;
a
---
1
2
3
(3 rows)

test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SAVEPOINT s; DELETE FROM tab; ROLLBACK TO SAVEPOINT s;
SAVEPOINT
DELETE 3
ROLLBACK
test=# SELECT a FROM tab;
a
---
1
(1 row)

test=# COMMIT;
COMMIT
test=# SELECT a FROM tab;
a
---
1
2
3
(3 rows)

test=# DROP TABLE tab;
DROP TABLE

Attachment Content-Type Size
combocid.tgz application/x-gzip 710 bytes

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Karl Schnaitter <karlsch(at)soe(dot)ucsc(dot)edu>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: fixing bug in combocid.c
Date: 2008-09-01 18:55:32
Message-ID: 48BC3AA4.7060804@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Karl Schnaitter wrote:
> This patch is for a bug in backend/utils/time/combocid.c.
>
> In HeapTupleHeaderAdjustCmax, the code makes the incorrect assumption
> that the raw command id corresponds to cmin, when the raw command id can
> in fact be a combo cid. It needs to be fixed to call
> HeapTupleHeaderGetCmin to get cmin. That's all there is to my patch, but
> read on for more details.

Yep. Patch committed.

Thanks for the report!

> This is my first patch; I hope I did it right!

You did well :-). No need to tar or gzip small patches like that, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com