Re: complex referential integrity constraints

From: "Joris Dobbelsteen" <Joris(at)familiedobbelsteen(dot)nl>
To: "elein" <elein(at)varlena(dot)com>, "Robert Haas" <Robert(dot)Haas(at)dyntek(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: complex referential integrity constraints
Date: 2007-02-19 10:58:51
Message-ID: 73427AD314CC364C8DF0FFF9C4D693FF5579@nehemiah.joris2k.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>-----Original Message-----
>From: pgsql-general-owner(at)postgresql(dot)org
>[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of elein
>Sent: zondag 18 februari 2007 23:16
>To: Robert Haas
>Cc: pgsql-general(at)postgresql(dot)org
>Subject: Re: [GENERAL] complex referential integrity constraints
>
>On Fri, Feb 16, 2007 at 09:58:56AM -0500, Robert Haas wrote:
>> So, I have the following problem.
>>
>> Suppose you have two kinds of animals, sheep and wolves. Since they
>> have very similar properties, you create a single table to hold both
>> kinds of animals, and an animal_type table to specify the
>type of each
>> animal:
>>
>> CREATE TABLE animal_type (
>> id integer not null,
>> name varchar(80) not null,
>> primary key (id)
>> );
>> INSERT INTO animal_type VALUES (1, 'Sheep'); INSERT INTO animal_type
>> VALUES (2, 'Wolf');
>>
>> CREATE TABLE animal (
>> id serial,
>> type_id integer not null references
>animal_type (id),
>> name varchar(80) not null,
>> age integer not null,
>> weight_in_pounds integer not null,
>> primary key (id)
>> );
>>
>> The animal_type table is more or less written in stone, but
>the animal
>> table will be updated frequently. Now, let's suppose that
>we want to
>> keep track of all of the cases where one animal is mauled by another
>> animal:
>>
>> CREATE TABLE mauling (
>> id serial,
>> attacker_id integer not null references animal (id),
>> victim_id integer not null references animal (id),
>> attack_time timestamp not null,
>> primary key (id)
>> );
>>
>> The problem with this is that I have a very unsettled feeling about
>> the foreign key constraints on this table. The victim_id constraint
>> is fine, but the attacker_id constraint is really
>inadequate, because
>> the attacker CAN NEVER BE A SHEEP. I really want a way to write a
>> constraint that says that the attacker must be an animal, but
>> specifically, a wolf.
>>
>> It would be really nice to be able to write:
>>
>> FOREIGN KEY (attacker_id, 2) REFERENCES animal (id, type_id)
>>
>> Or:
>>
>> CREATE UNIQUE INDEX wolves ON animal (id) WHERE type_id = 2;
>> -- and then
>> FOREIGN KEY (attacker_id) REFERENCES INDEX wolves
>>
>> ...but that's entirely speculative syntax. I don't think
>there's any
>> easy way to do this. (Please tell me I'm wrong.)
>>
>> The problem really comes in when people start modifying the animal
>> table. Every once in a while we have a case where we record
>something
>> as a wolf, but it turns out to have been a sheep in wolf's
>clothing.
>> In this case, we want to do something like this:
>>
>> UPDATE animal SET type_id = 1 WHERE id = 572;
>>
>> HOWEVER, this operation MUST NOT be allowed if it turns out
>there is a
>> row in the mauling table where attacker_id = 572, because that would
>> violate my integrity constraints that says that sheep do not maul.
>>
>> Any suggestions? I've thought about creating rules or triggers to
>> check the conditions, but I'm scared that this could either (a) get
>> really complicated when there are a lot more tables and constraints
>> involved or
>> (b) introduce race conditions.
>
>Why don't you add a field in animal_types that is boolean mauler.
>Then you can add a trigger on the mauling table to raise an
>error when the attacker_id is an animal type mauler.

This is only partial. You need a lot more triggers to guarentee the
constraints are enforced.
Precisely you need to validate:
* mauling on insert/update of attacker_id
* animal on update of type_id
* animal_type on update of your property

Of course you need to think about the MVCC model, such that:
Transaction 1 executes
INSERT INTO mauling VALUES ('someattacker'),
Transaction 2 executes
UPDATE animal_type SET mauler = false WHERE name = 'someattacker',
such that both transaction happen in parallel.

This is perfectly possible and will make it possible to violate the
constraint, UNLESS locking of the tuples is done correctly.

These contraints are not trivial to implement (unfortunally). It would
be great if they where.

- Joris

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Karsten Hilbert 2007-02-19 11:03:07 Re: Why *exactly* is date_trunc() not immutable ?
Previous Message David Primero Segundo 2007-02-19 10:19:48 RE: Postgresql 8.1 y Debian [ Era: Re: postgreSQL ]