WIP: Deferrable unique constraints

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)googlemail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: WIP: Deferrable unique constraints
Date: 2009-07-07 18:38:29
Message-ID: 8e2dbb700907071138y4ebe75cw81879aa513cf82a3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This is a feature I would find useful, and I wanted to learn more
about the database internals, so I thought I would give it a go as an
exercise.

I added 2 new index insert modes to support deferrable. During the
initial index insert, if the constraint is deferrable, it does a
"partial" check of uniqueness. This is a non-blocking check which
allows duplicates in, but can distinguish between definitely unique
and potentially non-unique. For potential uniqueness violations a
deferred trigger is queued to do a full check at the end of the
statement or transaction, or when SET CONSTRAINTS is called. The
trigger then replays the insert in a "fake insert" mode, which doesn't
actually insert anything - it just checks that what is already there
is unique, waiting for other transactions if necessary.

This approach works well if the number of potential conflicts is
small. For example, if uniqueness is not actually violated, there is
no significant penalty for the deferred check, since no triggers are
queued:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 79.131 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999,
2)); -- 1M rows
INSERT 0 1000000
Time: 11403.906 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.145 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21258.245 ms
pgdevel=# COMMIT;
COMMIT
Time: 83.267 ms

compared with:

pgdevel=# CREATE TABLE foo(a int UNIQUE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 110.277 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999,
2)); -- 1M rows
INSERT 0 1000000
Time: 11196.600 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+1; -- Uniqueness never violated
UPDATE 1000000
Time: 21054.430 ms
pgdevel=# COMMIT;
COMMIT
Time: 186.174 ms

However, if there are lots of temporarily non-unique values, the
performance impact is much greater, since the unique check is repeated
for each row at commit time:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY DEFERRED);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 64.244 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999,
2)); -- 1M rows
INSERT 0 1000000
Time: 11621.438 ms
pgdevel=# BEGIN;
BEGIN
Time: 0.146 ms
pgdevel=# UPDATE foo SET a=a+2; -- Uniqueness temporarily violated
UPDATE 1000000
Time: 21807.128 ms
pgdevel=# COMMIT;
COMMIT
Time: 14974.916 ms

And similarly for an "immediate" check:

pgdevel=# CREATE TABLE foo(a int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
Time: 44.477 ms
pgdevel=# INSERT INTO foo (SELECT * FROM generate_series(0, 1999999,
2)); -- 1M rows
INSERT 0 1000000
Time: 12083.089 ms
pgdevel=# UPDATE foo SET a=a+2;
UPDATE 1000000
Time: 37173.210 ms

Also, as for FKs and other queued triggers, this doesn't scale because
the queue is held in memory.

Curing the scalability problem by spooling the queue to disk shouldn't
be too hard to do, but that doesn't address the problem that if a
significant proportion of rows from the table need to be checked, it
is far quicker to scan the whole index once than check row by row.

In fact, with this data on my machine, a single scan takes around
0.5sec, so on that basis a full scan would be quicker if more than
around 3% of the data is flagged as potentially non-unique, although
the planner might be a better judge of that.

I can't see a way of coding something that can switch over to a full
scan, without first recording all the potential violations. Then I'm
not entirely sure how best to do the switch-over logic.

- Dean

Attachment Content-Type Size
deferrable_unique.patch application/octet-stream 66.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2009-07-07 18:42:33 Re: information_schema.columns changes needed for OLEDB
Previous Message Tom Lane 2009-07-07 18:21:51 Re: Re: Synch Rep: direct transfer of WAL file from the primary to the standby