Re: Deferrable Unique Constraints

Lists: pgsql-hackers
From: George Essig <george_essig(at)yahoo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Deferrable Unique Constraints
Date: 2005-01-26 18:12:12
Message-ID: 20050126181212.38684.qmail@web53707.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I noticed that implementing deferrable unique constraints is on the
TODO list. I don't think its been said yet, but currently you can
implement a deferrable unique constraint by using a deferrable
constraint trigger together with a procedural language like plpgsql.
If you need an index on a column, you can use a regular index instead
of a unique index.

Yes, I noticed that getting rid of constraint triggers is also on the
TODO list.

Below is an example.

George Essig

------------------------------------------

create table t (x integer, y integer);
create index t_x_in on t (x);

-- Create a trigger function to test for duplicate values of x.
-- Table t column x unique insert update trigger function.

create or replace function t_x_un_ins_up_tr() RETURNS "trigger"
AS '
declare
invalid integer;
begin

-- Not absolutely necessary, but avoids a query if the new and old
-- values of x are the same.

if TG_OP = ''UPDATE'' then
if new.x = old.x then
return new;
end if;
end if;

-- If 2 or more rows have the same value of x, set invalid to 1.

select 1 into invalid
from t
where x = new.x
offset 1 limit 1;

-- If found, raise exception.

if FOUND then
raise EXCEPTION
''Violation of unique constraint on column x in table t by new row:
x %, y %'', new.x, new.y;
end if;

return new;
end;'
LANGUAGE plpgsql;

-- Create a deferrable constraint trigger that executes the trigger function.
-- This runs at transaction commit time for every row that was inserted or updated.

create constraint trigger t_x_un_ins_up_tr after insert or update on t
deferrable initially deferred
for each row
execute procedure t_x_un_ins_up_tr ();

-- Begin a transaction.
-- Insert duplicate values of x successfully.
-- Violation of constraint when transaction is committed.

test=# begin;
BEGIN
test=# insert into t (x, y) values (1,1);
INSERT 30332079 1
test=# insert into t (x, y) values (1,2);
INSERT 30332080 1
test=# commit;
ERROR: Violation of unique constraint on column x in table t by new row:
x 1, y 1
test=# select * from t;
x | y
---+---
(0 rows)

-- Begin a transaction.
-- Insert duplicate values of x successfully.
-- Update one of the duplicate values to another value.
-- Commit transaction successfully.

test=# begin;
BEGIN
test=# insert into t (x, y) values (1,1);
INSERT 30332083 1
test=# insert into t (x, y) values (1,2);
INSERT 30332084 1
test=# update t set x = 2 where y = 2;
UPDATE 1
test=# commit;
COMMIT
test=# select * from t;
x | y
---+---
1 | 1
2 | 2
(2 rows)


From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-26 19:41:14
Message-ID: 877jm0yo4l.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


George Essig <george_essig(at)yahoo(dot)com> writes:

> I noticed that implementing deferrable unique constraints is on the
> TODO list. I don't think its been said yet, but currently you can
> implement a deferrable unique constraint by using a deferrable
> constraint trigger together with a procedural language like plpgsql.

You have a race condition. Two transactions can insert conflicting records and
if they commit at the same time they would both not see each other's
uncommitted records.

Off the top of my head it seems the way to go about doing this would be to
simply not insert the records in the index until commit time. This doesn't
actually sound so hard, is there any problem with this approach?

You could almost implement this with a deferred trigger, a boolean column, and
a partial unique index. However I don't think deferred constraint triggers can
modify the record data.

The unfortunate bit here is that even if this worked the trigger setting the
boolean flag which puts the record into the index would create a new copy of
the record. Since it's modifying a record inserted by the same transaction it
could in theory just modify it in place. I don't think any attempt is made to
do that though. In any case a real native implementation wouldn't really need
the flag so this problem wouldn't come up.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-26 19:57:53
Message-ID: 19003.1106769473@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Off the top of my head it seems the way to go about doing this would be to
> simply not insert the records in the index until commit time. This doesn't
> actually sound so hard, is there any problem with this approach?

Yeah:
begin;
insert into foo (key, ...) values (33, ...);
select * from foo where key = 33;
...

If the SELECT uses an indexscan it will fail to find the just-inserted
row.

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-26 20:48:05
Message-ID: 87vf9jyl16.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Off the top of my head it seems the way to go about doing this would be to
> > simply not insert the records in the index until commit time. This doesn't
> > actually sound so hard, is there any problem with this approach?
>
> Yeah:
> begin;
> insert into foo (key, ...) values (33, ...);
> select * from foo where key = 33;
> ...
>
> If the SELECT uses an indexscan it will fail to find the just-inserted
> row.

Well presumably you would need a non-unique index created for query execution
purposes. The unique index would be purely for enforcing the constraint.

--
greg


From: Neil Conway <neilc(at)samurai(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 04:31:29
Message-ID: 1106800289.3226.47.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-01-26 at 15:48 -0500, Greg Stark wrote:
> Well presumably you would need a non-unique index created for query execution
> purposes. The unique index would be purely for enforcing the constraint.

Yuck.

You could perhaps relax the uniqueness of the index during the
transaction itself, and keep around some backend-local indication of
which index entries it have been inserted. Then at transaction-commit
you'd need to re-check the inserted index entries to verify that they
are unique. It would be nice to just keep a pin on the leaf page that we
inserted into, although we'd need to take care to follow subsequent page
splits (could we use the existing L & Y techniques to do this?).
Needless to say, it would be pretty ugly...

-Neil


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 04:42:44
Message-ID: 12329.1106800964@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Neil Conway <neilc(at)samurai(dot)com> writes:
> You could perhaps relax the uniqueness of the index during the
> transaction itself, and keep around some backend-local indication of
> which index entries it have been inserted. Then at transaction-commit
> you'd need to re-check the inserted index entries to verify that they
> are unique.

Yeah, what I've been visualizing is a list of "tentative duplicates" ---
that is, you do the immediate unique check same as now, and if it passes
(which hopefully is most of the time) then you're in the clear.
Otherwise you log the apparent duplicate key value to be rechecked at
commit.

> It would be nice to just keep a pin on the leaf page that we
> inserted into, although we'd need to take care to follow subsequent page
> splits (could we use the existing L & Y techniques to do this?).

I do not believe we can do that without risking deadlocks. It'll be
safer just to repeat the search for each key value that's of concern.

regards, tom lane


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 04:47:43
Message-ID: 20050127044743.GC12171@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 27, 2005 at 03:31:29PM +1100, Neil Conway wrote:

> You could perhaps relax the uniqueness of the index during the
> transaction itself, and keep around some backend-local indication of
> which index entries it have been inserted. Then at transaction-commit
> you'd need to re-check the inserted index entries to verify that they
> are unique. It would be nice to just keep a pin on the leaf page that we
> inserted into, although we'd need to take care to follow subsequent page
> splits (could we use the existing L & Y techniques to do this?).

Maybe we can do something like

1. use a boolean-returning unique insertion. If it fails, returns
false, doesn't ereport(ERROR); if it works, inserts and returns true.

2. the caller checks the return value. If false, records the insertion
attempt into an should-check-later list.

3. at transaction end, unique insertion is tried again with the items on
the list. If it fails, the transaction is aborted.

It's only a SMOC, nothing difficult AFAICS. Disk-spilling logic
included, because it'd be probably needed.

--
Alvaro Herrera (<alvherre[(at)]dcc(dot)uchile(dot)cl>)
Si no sabes adonde vas, es muy probable que acabes en otra parte.


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 05:48:53
Message-ID: 87k6pzxvzu.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Yeah, what I've been visualizing is a list of "tentative duplicates" ---
> that is, you do the immediate unique check same as now, and if it passes
> (which hopefully is most of the time) then you're in the clear.

I don't see how you're in the clear. If session A does an insert and it
doesn't see a duplicate and doesn't commit, but then B does an insert and sees
the duplicate from A and marks his tentative, and then commits, shouldn't B's
commit succeed? Then when A commits shouldn't his fail? So A still has to
recheck even if there was no sign of a duplicate when he inserted.

Unless there's some way for B to indicate to A that his insert has become
tentative then I think you have to resign yourself to checking all deferred
unique constraints, not just ones that seem suspect.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 14:24:11
Message-ID: 19156.1106835851@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> I don't see how you're in the clear. If session A does an insert and it
> doesn't see a duplicate and doesn't commit, but then B does an insert and sees
> the duplicate from A and marks his tentative, and then commits, shouldn't B's
> commit succeed?

No. B, being the second to get there, has to wait to see if A commits
or not. This is true already and it wouldn't change. We would
however postpone the wait until B's commit time.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Neil Conway <neilc(at)samurai(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Deferrable Unique Constraints
Date: 2005-01-27 14:39:05
Message-ID: 19325.1106836745@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> It's only a SMOC, nothing difficult AFAICS. Disk-spilling logic
> included, because it'd be probably needed.

The question of disk-spilling is really the only part that seems very
hard. It would be useful to see if we could solve the problem of
spilling pending-trigger-event lists at the same time. Common
infrastructure possible, perhaps?

regards, tom lane