SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-16 18:52:20
Message-ID: 6456.1339872740@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

While looking at Vik Reykja's pending patch to improve the FK triggers
by skipping processing when a NULL column didn't change, I started to
wonder whether that really had no user-visible semantic effect.
In particular, in ON UPDATE SET NULL/SET DEFAULT cases, it seemed like
this could change the set of referencing columns that get set to NULL
or to their defaults. So the next question was which behavior actually
conforms to the SQL standard, and the answer to that is ... disturbing.

The code in ri_triggers.c was written against SQL92's definition of ON
UPDATE SET NULL/SET DEFAULT, which is (ignoring the MATCH PARTIAL case,
which we don't implement):

6) If an <update rule> is specified and a non-null value of a ref-
erenced column in the referenced table is updated to a value
that is distinct from the current value of that column, then

a) If <match type> is not specified or if FULL is specified,
then

ii) If the <update rule> specifies SET NULL, then

Case:

1) If <match type> is not specified, then in all matching
rows the referencing column that corresponds with the
referenced column is set to the null value.

2) If <match type> specifies FULL, then in all matching
rows each referencing column is set to the null value.

iii) If the <update rule> specifies SET DEFAULT, then in all
matching rows the referencing column that corresponds with
the referenced column is set to the default value specified
in the General Rules of Subclause 11.5, "<default clause>".

Note that only in the MATCH FULL + SET NULL case does it say to set
*all* the referencing columns in each matching row. Otherwise, you are
only supposed to change columns that correspond to referenced columns
that were changed. It's notable that SET NULL and SET DEFAULT have
different behaviors here.

On the other hand, in SQL:2008 I find (some boilerplate text omitted):

10) If a non-null value of a referenced column RC in the
referenced table is updated to a value that is distinct from the
current value of RC, then, for every member F of the subtable
family of the referencing table:

Case:

a) If M specifies SIMPLE or FULL, then

Case:

ii) If UR specifies SET NULL, then

Case:

1) If M specifies SIMPLE, then each matching row
MR in F is paired with the candidate replacement
row NMR, formed by copying MR and setting each
referencing column in the copy to the null
value. MR is identified for replacement by NMR
in F.

2) If M specifies FULL, then each matching row
MR in F is paired with the candidate replacement
row NMR, formed by copying MR and setting each
referencing column in the copy to the null
value. MR is identified for replacement by NMR
in F.

iii) If UR specifies SET DEFAULT, then each matching row
MR in F is paired with the candidate replacement row
NMR, formed by copying MR and setting each referencing
column in the copy to the default value specified in the
General Rules of Subclause 11.5, "<default clause>". MR
is identified for replacement by NMR in F.

So far as I can see, this says to set *all* referencing columns to
nulls or their defaults, in all four cases, whether the corresponding
referenced column was one that changed or not. This is very clearly
different from what SQL92 says. It's also rather curious that they
distinguish two "cases" for SET NULL when the texts are exactly alike.

It looks to me like this change occurred in SQL:2003, although
SQL:1999's version of the text is such badly written pseudo-mathematical
gobbledygook that it's a bit hard to tell which behavior they meant.
However, neither of those specs list any change in referential
constraint behavior as being an acknowledged incompatibility with the
prior standard. Have the SQL committee simply failed to notice that in
whacking this text around they changed the meaning? Which behavior is
actually implemented by other RDBMSes?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-16 20:18:48
Message-ID: 10168.1339877928@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Have the SQL committee simply failed to notice that in
> whacking this text around they changed the meaning? Which behavior is
> actually implemented by other RDBMSes?

If anyone is up for actually trying this, here is a script to test the
behavior in question:

create table pp (f1 int, f2 int, primary key (f1,f2));
create table cmssn (f1 int, f2 int,
foreign key (f1,f2) references pp(f1,f2) on update set null);
create table cmfsn (f1 int, f2 int,
foreign key (f1,f2) references pp(f1,f2) match full on update set null);
create table cmssd (f1 int default 0, f2 int default 0,
foreign key (f1,f2) references pp(f1,f2) on update set default);
create table cmfsd (f1 int default 0, f2 int default 0,
foreign key (f1,f2) references pp(f1,f2) match full on update set default);

insert into pp values (11, 22);
insert into pp values (11, 0);
insert into pp values (0, 0);

insert into cmssn values (11, 22);
insert into cmfsn values (11, 22);
insert into cmssd values (11, 22);
insert into cmfsd values (11, 22);

update pp set f2 = f2 + 1 where f2 > 0;

select * from cmssn;
select * from cmfsn;
select * from cmssd;
select * from cmfsd;

In Postgres this produces

f1 | f2
----+----
11 |
(1 row)

f1 | f2
----+----
|
(1 row)

f1 | f2
----+----
11 | 0
(1 row)

f1 | f2
----+----
0 | 0
(1 row)

which shows that we are self-consistent but not actually compliant with
either old or new wordings of the spec :-(

The only other SQL DB I have handy is mysql 5.5.24, which shows up
pretty unimpressively: it gives a syntax error on the cmssd definition,
which would be all right because the manual says the innodb storage
engine doesn't support SET DEFAULT, except it *doesn't* give a syntax
error for creating cmfsd. Then, the update fails claiming that cmfsn's
FK constraint is violated, so they evidently don't implement that case
correctly. After removing cmfsn, the update fails again claiming that
cmfsd's FK constraint is violated, so yeah they are telling the truth
when they say SET DEFAULT doesn't work. The upshot is that only the
MATCH SIMPLE SET NULL case works at all in current mysql, and that
produces the result

mysql> select * from cmssn;
+------+------+
| f1 | f2 |
+------+------+
| NULL | NULL |
+------+------+
1 row in set (0.00 sec)

so they are nulling all the referencing columns in this case, which
matches the more recent specs but is clearly contrary to SQL92.

Anybody have DB2, or something else that might be thought to be pretty
close to spec-compliant?

regards, tom lane


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-16 20:28:45
Message-ID: CAEZATCX3-GUhpaBWDCB+-nLE1a1OU5OP=n1n+W9+Rc1o3FS5ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 June 2012 21:18, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Anybody have DB2, or something else that might be thought to be pretty
> close to spec-compliant?
>

I have an Oracle DB, but they're not exactly known for spec
compliance. In fact they dodge this entire issue by not supporting ON
UPDATE actions at all :-)

Regards,
Dean


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-16 23:17:26
Message-ID: CAHHcreqRYo9Y_Xf3azUFJCZ-einno4pDohd8inATV2xTTeWFAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/6/16 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> I wrote:
>> Have the SQL committee simply failed to notice that in
>> whacking this text around they changed the meaning?  Which behavior is
>> actually implemented by other RDBMSes?
>
> If anyone is up for actually trying this, here is a script to test the
> behavior in question:
>
> create table pp (f1 int, f2 int, primary key (f1,f2));
> create table cmssn (f1 int, f2 int,
>    foreign key (f1,f2) references pp(f1,f2) on update set null);
> create table cmfsn (f1 int, f2 int,
>    foreign key (f1,f2) references pp(f1,f2) match full on update set null);
> create table cmssd (f1 int default 0, f2 int default 0,
>    foreign key (f1,f2) references pp(f1,f2) on update set default);
> create table cmfsd (f1 int default 0, f2 int default 0,
>    foreign key (f1,f2) references pp(f1,f2) match full on update set default);
>
> insert into pp values (11, 22);
> insert into pp values (11, 0);
> insert into pp values (0, 0);
>
> insert into cmssn values (11, 22);
> insert into cmfsn values (11, 22);
> insert into cmssd values (11, 22);
> insert into cmfsd values (11, 22);
>
> update pp set f2 = f2 + 1 where f2 > 0;
>
> select * from cmssn;
> select * from cmfsn;
> select * from cmssd;
> select * from cmfsd;
>
> In Postgres this produces
>
>  f1 | f2
> ----+----
>  11 |
> (1 row)
>
>  f1 | f2
> ----+----
>    |
> (1 row)
>
>  f1 | f2
> ----+----
>  11 |  0
> (1 row)
>
>  f1 | f2
> ----+----
>  0 |  0
> (1 row)
>
> which shows that we are self-consistent but not actually compliant with
> either old or new wordings of the spec :-(
>
> The only other SQL DB I have handy is mysql 5.5.24, which shows up
> pretty unimpressively: it gives a syntax error on the cmssd definition,
> which would be all right because the manual says the innodb storage
> engine doesn't support SET DEFAULT, except it *doesn't* give a syntax
> error for creating cmfsd.  Then, the update fails claiming that cmfsn's
> FK constraint is violated, so they evidently don't implement that case
> correctly.  After removing cmfsn, the update fails again claiming that
> cmfsd's FK constraint is violated, so yeah they are telling the truth
> when they say SET DEFAULT doesn't work.  The upshot is that only the
> MATCH SIMPLE SET NULL case works at all in current mysql, and that
> produces the result
>
> mysql> select * from cmssn;
> +------+------+
> | f1   | f2   |
> +------+------+
> | NULL | NULL |
> +------+------+
> 1 row in set (0.00 sec)
>
> so they are nulling all the referencing columns in this case, which
> matches the more recent specs but is clearly contrary to SQL92.
>
> Anybody have DB2, or something else that might be thought to be pretty
> close to spec-compliant?

I tryed in a MS SQL Server 2012 via SQLFiddle [1]. I could only create
'cmssn' and 'cmssd' tables because as I can see in [2] MS SQL Server
2012 doesn't supports MATCH syntax.

The result was:

select * from cmssn;
F1 | F2
(null) | (null)

select * from cmssd;
F1 | F2
0 | 0

The test is in [3], and there you can try other RDBMS, just create the
schema on the left panel and testing selects on the right.

[1] http://sqlfiddle.com
[2] http://msdn.microsoft.com/en-us/library/ms174979.aspx
[3] http://sqlfiddle.com/#!6/ac7db/1

Regards.
--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-18 00:44:51
Message-ID: 23930.1339980291@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Anybody have DB2, or something else that might be thought to be pretty
> close to spec-compliant?

Remarkably enough, the DB2 10.1 manuals at www.ibm.com say that it
doesn't support ON UPDATE SET NULL or ON UPDATE SET DEFAULT. I'm
disappointed in them :-(. But anyway it seems that we'll not get
that much guidance from looking at other SQL implementations, and
what precedents there are suggest that people are using the set-all-
the-columns interpretation.

After reflection it seems clear to me that set-all-the-columns is
in fact an improvement for the SET DEFAULT case, regardless of match
style. If we set only some of them, you get a mishmash of old and
new column values which is rather unlikely to match any row of the
referenced table. If we always set all of them, then (at least for
constant default values) only one "fallback" entry is required in
the referenced table. This can be seen in my example script upthread,
where I had to make a bogus referenceable entry "11, 0" to prevent
an RI failure on the MATCH SIMPLE update. Having just the one fallback
entry "0, 0" definitely seems saner from an application standpoint.

I'm less sold on set-all-the-columns for the MATCH SIMPLE SET NULL
case. In this match style, setting any referencing column to null
is sufficient to prevent an RI failure, and it could be argued that
zapping all of them discards data that might be useful. But it does
have the advantage of predictability.

From an implementation standpoint, set-all-the-columns is definitely
easier to deal with: we won't need ri_OneKeyEqual at all any more,
and RI_FKey_setnull_upd no longer has the problem of having to deal
with variant plans depending on which columns it needs to zap.
So I'm attracted to it on that basis, but I don't want to let
implementation concerns drive the decision.

On balance I think we ought to switch to set-all-the-columns, though
only in 9.3+ --- a back-patched behavioral change doesn't seem like a
good idea.

Any objections, or anyone want to do more research before we decide?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SQL standard changed behavior of ON UPDATE SET NULL/SET DEFAULT?
Date: 2012-06-18 03:35:49
Message-ID: 15938.1339990549@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> On balance I think we ought to switch to set-all-the-columns, though
> only in 9.3+ --- a back-patched behavioral change doesn't seem like a
> good idea.

And here is a draft patch for that. I was interested to find that the
documentation already claims that all columns are set in the relevant
cases (so the docs changes here are just wordsmithing and
clarification).

regards, tom lane

Attachment Content-Type Size
ref-action-all-columns.patch text/x-patch 28.7 KB