BUG #4417: Foreign keys do not work after altering table/column names

Lists: pgsql-bugs
From: "Benjamin Bihler" <benjamin(dot)bihler(at)gmx(dot)de>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #4417: Foreign keys do not work after altering table/column names
Date: 2008-09-15 09:30:37
Message-ID: 200809150930.m8F9UbFf051056@wwwmaster.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 4417
Logged by: Benjamin Bihler
Email address: benjamin(dot)bihler(at)gmx(dot)de
PostgreSQL version: 8.3.1
Operating system: SUSE Linux
Description: Foreign keys do not work after altering table/column
names
Details:

We have a SQL-script to create tables and constraints, insert data into
these tables, alter the tables and again insert some data. There is one
"master" script file that includes subscripts with

\i Subscript.sql

These scripts do the following:

We create tables A and B with a foreign key stating that A.column references
B.column.

We rename table B into B_NEW and the column B.column into B_NEW.column_new.

Afterwards we want to update values in A.column. We get an error message
(translated from German):

Relation "public.B" does not exist.
CONTEXT: SQL statement "SELECT 1 FROM ONLY "public"."B" x WHERE "B.column"
OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

and

Updating or Deletion in table ... violates foreign key contraint.
DETAIL: There is still a reference to ... in table ...

These scripts work correctly with PostgreSQL 8.2.5.

Back to PostgreSQL 8.3.1: If we change the master script, so that it does
not include the subscripts to alter table B and to update A.column and we
call these subscripts separately after the master script has finished, they
WORK!

I mean:

psql -f Master_complete.sql DATABASE_NAME does not work!

psql -f Master_without_altering_B_and_updating_A.sql DATABASE_NAME
psql -f Altering_B_and_updating_A.sql DATABASE_NAME works!

Unfortunately it is not possible to give more details, since these scripts
are part of a commercial product.


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Benjamin Bihler <benjamin(dot)bihler(at)gmx(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4417: Foreign keys do not work after altering table/column names
Date: 2008-09-15 11:31:22
Message-ID: 48CE478A.9070707@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Benjamin Bihler wrote:
> Unfortunately it is not possible to give more details, since these scripts
> are part of a commercial product.

It's unlikely that anyone can help you without more details. I'd suggest
that you try to reduce it to a smaller stand-alone, anonymizing it by
changing table and column names if necessary, and verify that you can
still reproduce the problem with the reduced script. And then post that
script to the mailing list.

Alternatively, there's many support companies out there, including the
company I work for, that can help you under a NDA.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Benjamin Bihler <benjamin(dot)bihler(at)gmx(dot)de>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4417: Foreign keys do not work after altering table/column names
Date: 2008-09-15 13:22:21
Message-ID: 24519.1221484941@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Benjamin Bihler wrote:
>> Unfortunately it is not possible to give more details, since these scripts
>> are part of a commercial product.

> It's unlikely that anyone can help you without more details.

Although the OP could certainly have worked a bit harder at providing a
self-contained example, it's not hard to reproduce a problem:

regression=# create table atab(col int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "atab_pkey" for table "atab"
CREATE TABLE
regression=# create table btab(col int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "btab_pkey" for table "btab"
CREATE TABLE
regression=# alter table atab add column bref int references btab;
ALTER TABLE
regression=# insert into atab values(1,2);
ERROR: insert or update on table "atab" violates foreign key constraint "atab_bref_fkey"
DETAIL: Key (bref)=(2) is not present in table "btab".
regression=# insert into btab values(2);
INSERT 0 1
regression=# insert into atab values(1,2);
INSERT 0 1
regression=# alter table btab rename to b_new;
ALTER TABLE
regression=# alter table b_new rename col to col_new;
ALTER TABLE
regression=# update atab set bref = 3;

At this point 8.2 gives the expected error:

ERROR: insert or update on table "atab" violates foreign key constraint "atab_bref_fkey"
DETAIL: Key (bref)=(3) is not present in table "b_new".

but in CVS HEAD I get:

ERROR: relation "public.btab" does not exist
LINE 1: SELECT 1 FROM ONLY "public"."btab" x WHERE "col" OPERATOR(pg...
^
QUERY: SELECT 1 FROM ONLY "public"."btab" x WHERE "col" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x

What is evidently happening is that the plancache decides it needs to
redo the cached plan for this query, but it's regenerating it from
a static version of the query text.

8.2 is more or less accidentally failing to fail: it's got a cached
plan, which it is too stupid to try to replan, but that plan still
works because it's referring to B by OID and attnum. There are huge
numbers of cases where 8.2 will fail miserably on alterations of the
tables involved in the FK, but simple renaming isn't one of them.
Whereas 8.3 handles all the other cases and falls down on renaming :-(

So I guess what we'd need to fix this is some sort of wart added to the
plancache mechanism that would allow the RI triggers to regenerate their
query text, not only update the derived plan, when a plan invalidation
event occurs.

My first thought was some kind of callback function to construct the
query text afresh, but that seems pretty baroque. It'd be simpler
to just have an option to let plancache.c return a failure indication
when its plan is obsolete, whereupon ri_triggers.c discards that plan
and builds a fresh one. [ pokes around... ] Hm, the fly in that
ointment is that ri_triggers and plancache aren't communicating directly
but through SPI. I'm really loath to change the SPI API for this;
is there another way?

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Benjamin Bihler <benjamin(dot)bihler(at)gmx(dot)de>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4417: Foreign keys do not work after altering table/column names
Date: 2008-09-15 13:52:05
Message-ID: 48CE6885.9000703@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> Benjamin Bihler wrote:
>>> Unfortunately it is not possible to give more details, since these scripts
>>> are part of a commercial product.
>
>> It's unlikely that anyone can help you without more details.
>
> Although the OP could certainly have worked a bit harder at providing a
> self-contained example, it's not hard to reproduce a problem:
>
> ...

Oh, good.

> My first thought was some kind of callback function to construct the
> query text afresh, but that seems pretty baroque. It'd be simpler
> to just have an option to let plancache.c return a failure indication
> when its plan is obsolete, whereupon ri_triggers.c discards that plan
> and builds a fresh one. [ pokes around... ] Hm, the fly in that
> ointment is that ri_triggers and plancache aren't communicating directly
> but through SPI. I'm really loath to change the SPI API for this;
> is there another way?

ri_triggers.c could register a callback to invalidate entries in its
hash table with CacheRegisterRelcacheCallback() ?

Adding a new function, say SPI_is_plan_valid(plan), doesn't seem too bad
to me either.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Benjamin Bihler <benjamin(dot)bihler(at)gmx(dot)de>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4417: Foreign keys do not work after altering table/column names
Date: 2008-09-15 14:31:15
Message-ID: 25782.1221489075@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> My first thought was some kind of callback function to construct the
>> query text afresh, but that seems pretty baroque. It'd be simpler
>> to just have an option to let plancache.c return a failure indication
>> when its plan is obsolete, whereupon ri_triggers.c discards that plan
>> and builds a fresh one. [ pokes around... ] Hm, the fly in that
>> ointment is that ri_triggers and plancache aren't communicating directly
>> but through SPI. I'm really loath to change the SPI API for this;
>> is there another way?

> ri_triggers.c could register a callback to invalidate entries in its
> hash table with CacheRegisterRelcacheCallback() ?

> Adding a new function, say SPI_is_plan_valid(plan), doesn't seem too bad
> to me either.

It's actually looking kind of nasty. We don't know for sure that the
plan is still valid until RevalidateCachedPlan has succeeded in locking
all the tables involved. (We could look to see if it's already invalid
at some earlier point, but that just leaves us with race conditions.)
And exposing that part of the procedure to a SPI caller would be pretty
horrid --- note the comments in spi.c about not wanting any daylight between
RevalidateCachedPlan and PortalDefineQuery.

The callback solution won't be any fun either. Aside from being a bit
baroque, it would require nontrivial code refactoring in ri_triggers.c
to pull the query text generation code out into separate subroutines.

One possibility is that it looks like all the RI triggers actually
acquire locks on both relations before they do anything interesting.
So maybe there really wouldn't be any race condition if we execute an
is_plan_valid probe at the time of ri_FetchPreparedPlan. It's a bit
shaky but might be better than trying to solve it in a "cleaner"
fashion; certainly anything more invasive than that wouldn't seem like
a good candidate for back-patching.

regards, tom lane