Proposal: ON UPDATE REMOVE foreign key action

Lists: pgsql-hackers
From: Kirill Berezin <enelar(at)exsul(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal: ON UPDATE REMOVE foreign key action
Date: 2016-10-03 15:37:29
Message-ID: CAAObgf-A5=5NOjwvHsOS0SuWb+QLg2O=oF6oa3RfZ8QANd9ArQ@mail.gmail.com

*One-line Summary:* On a foreign key update we are unable to remove all
dependent records. Currently we have "ON REMOVE CASCADE DELETE", but no
"ON UPDATE CASCADE DELETE". We can only set the field to NULL or DEFAULT.

*Business Use-case:* Cache expiration on hash/version update. Revoke all
access on account id update.

In my case I ran into this situation: I am using access links to share a
user account. The account owner can give a private link to somebody, and
the session becomes mirrored (owner access to the account is granted).
You can imagine Facebook desktop and mobile sessions. It's just a shortcut
for entering credentials. Now I am implementing "revoke all but me". It's
simple: since each user is indexed by uuid, I just generate a new uuid for
the current account. The old uuid becomes invalid for other sessions, since
no record is found in the database.
I want to remove any pending access links, to prevent a bad guy from
restoring access.
I could set the linked account to NULL and then clear the record on
expiration, but I feel that removing it automatically on the update event
is more rational.

*User impact with the change:* Instead of writing "on update" triggers for
each dependent table, the desired action is done in a single line.

*Implementation details:* On cascade, switch the "update" action to "delete".

*Estimated Development Time:* A few hours or less.

*Opportunity Window Period:* Not applicable, minor feature.

*Budget Money:* I am ready to implement it myself, if approved.

*Contact Information:* enelar(at)exsul(dot)net


From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Kirill Berezin <enelar(at)exsul(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: ON UPDATE REMOVE foreign key action
Date: 2016-10-03 19:27:39
Message-ID: CAKOSWNk6HvS4bzGDBt0rcYWTWhEV3_e2=QiZujrx+upcZ0uy4A@mail.gmail.com

On 10/3/16, Kirill Berezin <enelar(at)exsul(dot)net> wrote:
> *One-line Summary:* On a foreign key update we are unable to remove all
> dependent records. Currently we have "ON REMOVE CASCADE DELETE", but no
> "ON UPDATE CASCADE DELETE". We can only set the field to NULL or DEFAULT.

I think there are three reasons why we don't have it implemented.
The first one is that there is no such grammar in the SQL spec (your
version is also wrong: the SQL spec has "ON DELETE CASCADE" as well as "ON
UPDATE CASCADE" [or any other action instead of "CASCADE"]).

The second one is that in almost all cases there is no reason to delete
rows because the referenced row was updated. If these rows are still
connected, they should be updated; if not, they should be left as is
("NO ACTION") or have the reference link removed ("SET NULL" or
"SET DEFAULT").
These rows have data, which is why they are still in the tables. They can
be deleted (by reference) if and only if the "parent" or "linked" data (all
of the data, not just the referenced key) is deleted.

> *Business Use-case:* Cache expiration on hash/version update. Revoke all
> access on account id update.

> In my case I ran into this situation: I am using access links to share a
> user account. The account owner can give a private link to somebody, and
> the session becomes mirrored (owner access to the account is granted).

And the third reason is avoiding bad design. If you have to give
access to anyone and you know that access will be revoked sooner or later,
it is wise to give a private link with a different identifier which can
be easily found and removed by the grantor id (your id).

> You can imagine Facebook desktop and mobile sessions.

Which, of course, have different session ids. You can revoke a session
without renaming your own.

> It's just a shortcut for
> entering credentials. Now I am implementing "revoke all but me". It's
> simple: since each user is indexed by uuid, I just generate a new uuid for
> the current account. The old uuid becomes invalid for other sessions, since
> no record is found in the database.
> I want to remove any pending access links, to prevent a bad guy from
> restoring access.
> I could set the linked account to NULL,

Why not just delete them when the grantor revokes access?

> and then clear the record on
> expiration, but I feel that removing it automatically on the update event
> is more rational.

I personally don't see the need to introduce new non-spec grammar.
If you think I have not understood you, send an example with a schema:
what you have now and how you expect it to work.

--
Best regards,
Vitaly Burovoy


From: Kirill Berezin <enelar(at)exsul(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Kirill Berezin <enelar(at)exsul(dot)net>, Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>, Pantelis Theodosiou <ypercube(at)gmail(dot)com>
Subject: Re: Proposal: ON UPDATE REMOVE foreign key action
Date: 2016-10-04 15:25:46
Message-ID: CAAObgf8CGnjbGjJBCKvkG2xPU4e8G9f+D7fO4Zx2ButwwJLdng@mail.gmail.com

Disclaimer: sorry, I don't understand whether I should reply to each of you
personally or just reply to the list. Some feedback was sent to me
privately, and some included a copy to the list.

Thanks for the responses; you understood it correctly.

When I said "anybody", I meant including the owner himself. For example,
cookie poisoning.
There is no "another" session, technically. They look the same to the
server; they can even have the same IP.
Yes, we can't prevent it with CSRF cookies, but that is not the point of
the current conversation.

I can put the business logic outside the table: make an extra query. I just
don't like how it looks from the perspective of encapsulation.
Each table should describe itself, like an object in an OOP language, with
SQL constructs or triggers/constraints.

The second part of my use case is a data cache. When the user updates the
version (generation), the cache should be flushed. As an easy example:
imagine I am fetching a currency value, and until the end of the day I
honor the current rate. (Or any other information that has certain origin
checkpoints.) When I update the origin state (current day, server version,
IP address, neural network generation), I have to invalidate all previous
data.

It is like calculating the digits of the square root of some number. The
more time I spend, the closer my approximate value gets to the irrational
result. But when the original value changes, all previous data no longer
makes sense. I flush it and start again from digit 1.

These are allegorical examples of my real-world cases. I could try to
imagine some hypothetical situations where this functionality would be more
welcome. But I respect the reasons for not applying this proposal. If my
update didn't shift the balance, that's OK. An "on update" trigger is not
that painful.
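
The "on update" trigger workaround mentioned here could look roughly like
the following. This is only a minimal sketch under stated assumptions: the
`accounts` and `access_links` tables and the `drop_links_on_rekey` trigger
are hypothetical names, not the proposer's schema, and SQLite (driven from
Python's `sqlite3` module) stands in for PostgreSQL, where the trigger body
would be written as a trigger function instead.

```python
import sqlite3

# Hypothetical schema: access links reference the account's uuid-like key.
SCHEMA = """
CREATE TABLE accounts (id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE access_links (
    token      TEXT PRIMARY KEY,
    account_id TEXT NOT NULL REFERENCES accounts(id)
);
-- When an account's key is about to change, drop its access links first,
-- so the plain (NO ACTION) foreign key is never violated.
CREATE TRIGGER drop_links_on_rekey
BEFORE UPDATE OF id ON accounts
FOR EACH ROW WHEN NEW.id <> OLD.id
BEGIN
    DELETE FROM access_links WHERE account_id = OLD.id;
END;
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn

def rekey_account(conn, old_id, new_id):
    """Give the account a new key; the trigger removes the old links."""
    with conn:  # commit on success, roll back on error
        conn.execute("UPDATE accounts SET id = ? WHERE id = ?", (new_id, old_id))
```

One such trigger is needed per dependent table, which is the repetition the
proposal wants to avoid.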


From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Kirill Berezin <enelar(at)exsul(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Pantelis Theodosiou <ypercube(at)gmail(dot)com>
Subject: Re: Proposal: ON UPDATE REMOVE foreign key action
Date: 2016-10-05 05:52:57
Message-ID: CAKOSWN=dLmYopTYPhC=deDBpQk6kN=Q670HOwnxC7Lffp=QoRw@mail.gmail.com

On 10/4/16, Kirill Berezin <enelar(at)exsul(dot)net> wrote:
> Disclaimer: sorry, I don't understand whether I should reply to each of you
> personally or just reply to the list. Some feedback was sent to me
> privately, and some included a copy to the list.

Usually discussions take place on the list, therefore you should use "reply
to all" (see [1]).
The exception is when a sender notes "Off the list".

> Thanks for the responses; you understood it correctly.
>
> When I said "anybody", I meant including the owner himself. For example,
> cookie poisoning.
> There is no "another" session, technically. They look the same to the
> server; they can even have the same IP.
> Yes, we can't prevent it with CSRF cookies, but that is not the point of
> the current conversation.
>
> I can put the business logic outside the table: make an extra query.

Good decision. Your case needs exactly what you've just written.

> I just don't like how it looks from the perspective of encapsulation.
> Each table should describe itself, like an object in an OOP language,
> with SQL constructs or triggers/constraints.

SQL is not OOP. There is no "encapsulation".

> Second part of my use case is data cache.

Hmm. Using an RDBMS as a cache, with the overhead of Isolation and
Durability (from ACID)? Really?
In my opinion it is a bad idea in most cases.

> When the user updates the version (generation), the cache should be
> flushed. As an easy example: imagine I am fetching a currency value, and
> until the end of the day I honor the current rate. (Or any other
> information that has certain origin checkpoints.) When I update the origin
> state (current day, server version, IP address, neural network
> generation), I have to invalidate all previous data.

It is a bad example. Companies working with currency exchange rates
always keep their values as historical data.

> It is like calculating the digits of the square root of some number. The
> more time I spend, the closer my approximate value gets to the irrational
> result. But when the original value changes, all previous data no longer
> makes sense. I flush it and start again from digit 1.

Why do you "update" the original value instead of deleting the old one and
inserting a new value?

> These are allegorical examples of my real-world cases. I could try to
> imagine some hypothetical situations where this functionality would be more
> welcome. But I respect the reasons for not applying this proposal. If my
> update didn't shift the balance, that's OK. An "on update" trigger is not
> that painful.

All your cases (except the exchange rate one) can be done using two
queries: delete the original row (which deletes the other linked data via
"ON DELETE CASCADE") and insert a new one. You don't even have to use
transactions!
If your business logic is so "OOP", you can use stored procedures, but
introducing new grammar specifically for one concrete task is a bad idea.
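
A minimal sketch of the two-query approach, under stated assumptions: the
`accounts` and `access_links` tables and the `revoke_all` helper are
hypothetical names, and SQLite (via Python's `sqlite3` module) stands in
for PostgreSQL. The foreign key clause itself is the standard
"ON DELETE CASCADE".

```python
import sqlite3
import uuid

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE access_links ("
        " token TEXT PRIMARY KEY,"
        " account_id TEXT NOT NULL REFERENCES accounts(id) ON DELETE CASCADE)"
    )
    return conn

def revoke_all(conn, old_id):
    """Replace the account row with a fresh key; linked rows are cascaded away."""
    new_id = str(uuid.uuid4())
    with conn:  # commit on success, roll back on error
        (name,) = conn.execute(
            "SELECT name FROM accounts WHERE id = ?", (old_id,)
        ).fetchone()
        # Query 1: deleting the parent removes its access links via the cascade.
        conn.execute("DELETE FROM accounts WHERE id = ?", (old_id,))
        # Query 2: re-create the account under the new key.
        conn.execute("INSERT INTO accounts (id, name) VALUES (?, ?)", (new_id, name))
    return new_id
```

The sketch wraps both statements in one transaction so that a failure
between them cannot lose the account row; that is a design choice of the
sketch rather than something the two-query approach requires.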

Of course, at first sight "ON UPDATE SET (NULL|DEFAULT)" looks like a
meaningless sequence, but the purpose of SET NULL and SET DEFAULT, for
both ON UPDATE and ON DELETE, is to "unlink" data from the referenced
row. It is similar to "NO ACTION", but it explicitly changes the
referencing rows, since they are no longer connected to the referenced
row (by the referencing column list).
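
The "unlinking" behaviour can be seen in a small sketch, again with
hypothetical `accounts` and `access_links` tables and with SQLite (via
Python's `sqlite3` module) standing in for PostgreSQL: after the
referenced key is updated, the referencing row survives, but its link is
set to NULL.

```python
import sqlite3

def demo_set_null():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE access_links ("
        " token TEXT PRIMARY KEY,"
        " account_id TEXT REFERENCES accounts(id)"
        "   ON UPDATE SET NULL ON DELETE SET NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES ('old')")
    conn.execute("INSERT INTO access_links VALUES ('t1', 'old')")
    # Changing the referenced key unlinks the child row instead of deleting it.
    conn.execute("UPDATE accounts SET id = 'new' WHERE id = 'old'")
    return conn.execute("SELECT token, account_id FROM access_links").fetchall()
```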

Also, your proposal is not consistent: ON UPDATE REMOVE (DELETE?), but
for ON DELETE, what? Again remove/delete?

[1] https://wiki.postgresql.org/wiki/Mailing_Lists#Using_the_discussion_lists
--
Best regards,
Vitaly Burovoy