DROP FUNCTION failure: cache lookup failed for relation X

Lists: pgsql-hackerspgsql-patches
From: Michael Fuhr <mike(at)fuhr(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-01-22 04:45:19
Message-ID: 20070122044519.GA62207@winnie.fuhr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I've found a situation that causes DROP FUNCTION to fail (tested
in 8.1.6, 8.2.1, and 8.3devel):

CREATE TABLE foo (id integer);

CREATE FUNCTION foofunc() RETURNS trigger AS $$
BEGIN
RETURN NEW;
END;
$$ LANGUAGE plpgsql;

Then in concurrent sessions:

A: BEGIN;

A: CREATE TRIGGER footrig BEFORE INSERT ON foo
FOR EACH ROW EXECUTE PROCEDURE foofunc();

B: DROP TABLE foo; -- blocks pending A's commit

A: COMMIT; -- B's DROP TABLE completes

A: SELECT tgrelid FROM pg_trigger WHERE tgname = 'footrig';
tgrelid
---------
66153
(1 row)

A: DROP FUNCTION foofunc();
ERROR: cache lookup failed for relation 66153

Apparently the row in pg_trigger that A committed wasn't deleted
by B's DROP TABLE, presumably because B didn't have visibility to
to the trigger when its DROP TABLE statement began. This case is
admittedly contrived but I did stumble across it in a test environment.

--
Michael Fuhr


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-01-22 07:07:00
Message-ID: 6709.1169449620@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Michael Fuhr <mike(at)fuhr(dot)org> writes:
> I've found a situation that causes DROP FUNCTION to fail (tested
> in 8.1.6, 8.2.1, and 8.3devel):

Ugh ... I haven't traced this through in detail, but I'm pretty sure
the problem arises from the fact that dependency.c traces through
auto/internal dependencies before actually starting to do the deletions
(and thus before acquiring locks). Can we fix this without multiple
scans of the dependency tree (probably costing O(N^lots))?

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for
Date: 2007-01-25 23:05:08
Message-ID: 200701252305.l0PN58U14783@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Is this a TODO?

---------------------------------------------------------------------------

Tom Lane wrote:
> Michael Fuhr <mike(at)fuhr(dot)org> writes:
> > I've found a situation that causes DROP FUNCTION to fail (tested
> > in 8.1.6, 8.2.1, and 8.3devel):
>
> Ugh ... I haven't traced this through in detail, but I'm pretty sure
> the problem arises from the fact that dependency.c traces through
> auto/internal dependencies before actually starting to do the deletions
> (and thus before acquiring locks). Can we fix this without multiple
> scans of the dependency tree (probably costing O(N^lots))?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-01-27 22:53:42
Message-ID: 23660.1169938422@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I wrote:
> Michael Fuhr <mike(at)fuhr(dot)org> writes:
>> I've found a situation that causes DROP FUNCTION to fail (tested
>> in 8.1.6, 8.2.1, and 8.3devel):
>> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php

> Ugh ... I haven't traced this through in detail, but I'm pretty sure
> the problem arises from the fact that dependency.c traces through
> auto/internal dependencies before actually starting to do the deletions
> (and thus before acquiring locks).

I looked into this a bit more, and found that it's actually a pretty
general issue with the dependency.c code: we delete objects depending
on a target object before we delete the target itself. Which is fine,
except that we don't try to take out any lock on the target object until
we get to the object-type-specific subroutine that's supposed to delete
it.

I think we could fix this for tables by acquiring lock on a table at the
instant it's put into a list for deletion inside dependency.c. That
would be enough to fix Michael's problem instance, but what of other
types of objects? There doesn't seem to be anything preventing somebody
from, say, deleting a function at the same time someone else is creating
an operator depending on the function. We mostly don't take locks on
non-table objects while working with them, and for the most part this is
fairly sane because those objects are defined by a single system catalog
row anyway: either you see the row or you don't. But this means that
the depended-on object could be gone by the time you finish adding a
dependency on it.

It seems a general solution would involve having dependency.c take
exclusive locks on all types of objects (not only tables) as it scans
them and decides they need to be deleted later. And when adding a
pg_depend entry, we'd need to take a shared lock and then recheck to
make sure the object still exists. This would be localized in
dependency.c, but it still seems like quite a lot of mechanism and
cycles added to every DDL operation. And I'm not at all sure that
we'd not be opening ourselves up to deadlock problems.

I'm a bit tempted to fix only the table case and leave the handling of
non-table objects as is. Comments?

regards, tom lane


From: Richard Troy <rtroy(at)ScienceTools(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation
Date: 2007-01-28 20:44:46
Message-ID: Pine.LNX.4.33.0701281238000.30496-100000@denzel.in
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

> It seems a general solution would involve having dependency.c take
> exclusive locks on all types of objects (not only tables) as it scans
> them and decides they need to be deleted later. And when adding a
> pg_depend entry, we'd need to take a shared lock and then recheck to
> make sure the object still exists. This would be localized in
> dependency.c, but it still seems like quite a lot of mechanism and
> cycles added to every DDL operation. And I'm not at all sure that
> we'd not be opening ourselves up to deadlock problems.
>
> I'm a bit tempted to fix only the table case and leave the handling of
> non-table objects as is. Comments?
>
> regards, tom lane

The taking of DDL locks is very unlikely to create a performance problem
for anyone as DML statements typically far outnumber DDL statements.
Further, in my experience, DDL statements are very carefully thought
through and are usually either completely automated by well crafted
programs or are performed by one person at a time - the DBA. I therefore
conclude that any deadlock risk is triflingly small and would be a
self-inflicted circumstance.

Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
rtroy(at)ScienceTools(dot)com, http://ScienceTools.com/


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-02-01 02:38:36
Message-ID: 200702010238.l112cao08697@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Uh, where are we on this?

---------------------------------------------------------------------------

Tom Lane wrote:
> I wrote:
> > Michael Fuhr <mike(at)fuhr(dot)org> writes:
> >> I've found a situation that causes DROP FUNCTION to fail (tested
> >> in 8.1.6, 8.2.1, and 8.3devel):
> >> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php
>
> > Ugh ... I haven't traced this through in detail, but I'm pretty sure
> > the problem arises from the fact that dependency.c traces through
> > auto/internal dependencies before actually starting to do the deletions
> > (and thus before acquiring locks).
>
> I looked into this a bit more, and found that it's actually a pretty
> general issue with the dependency.c code: we delete objects depending
> on a target object before we delete the target itself. Which is fine,
> except that we don't try to take out any lock on the target object until
> we get to the object-type-specific subroutine that's supposed to delete
> it.
>
> I think we could fix this for tables by acquiring lock on a table at the
> instant it's put into a list for deletion inside dependency.c. That
> would be enough to fix Michael's problem instance, but what of other
> types of objects? There doesn't seem to be anything preventing somebody
> from, say, deleting a function at the same time someone else is creating
> an operator depending on the function. We mostly don't take locks on
> non-table objects while working with them, and for the most part this is
> fairly sane because those objects are defined by a single system catalog
> row anyway: either you see the row or you don't. But this means that
> the depended-on object could be gone by the time you finish adding a
> dependency on it.
>
> It seems a general solution would involve having dependency.c take
> exclusive locks on all types of objects (not only tables) as it scans
> them and decides they need to be deleted later. And when adding a
> pg_depend entry, we'd need to take a shared lock and then recheck to
> make sure the object still exists. This would be localized in
> dependency.c, but it still seems like quite a lot of mechanism and
> cycles added to every DDL operation. And I'm not at all sure that
> we'd not be opening ourselves up to deadlock problems.
>
> I'm a bit tempted to fix only the table case and leave the handling of
> non-table objects as is. Comments?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-02-01 06:05:51
Message-ID: 12192.1170309951@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Uh, where are we on this?

Still in the think-about-it mode, personally ... my proposed fix is
certainly much too invasive to consider back-patching, so unless someone
comes up with a way-simpler idea, it's 8.3 material at best ...

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Fuhr <mike(at)fuhr(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DROP FUNCTION failure: cache lookup failed for relation X
Date: 2007-02-20 22:29:38
Message-ID: 200702202229.l1KMTcY16338@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


Added to TODO:

* Increase locking when DROPing objects so dependent objects cannot
get dropped while the DROP operation is happening

http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Uh, where are we on this?
>
> Still in the think-about-it mode, personally ... my proposed fix is
> certainly much too invasive to consider back-patching, so unless someone
> comes up with a way-simpler idea, it's 8.3 material at best ...
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Cache lookup failed for relation X [was: DROP FUNCTION cache lookup failed for relation X]
Date: 2007-11-01 08:19:24
Message-ID: 47298C0C.50508@paradise.net.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>
> Still in the think-about-it mode, personally ... my proposed fix is
> certainly much too invasive to consider back-patching, so unless someone
> comes up with a way-simpler idea, it's 8.3 material at best ...
>
>
I ran into a variant of this today - simply creating and dropping a
table repeatedly while doing \d from another session:

Session 1:

perl -e 'while (1) {print "drop table if exists z0; \n create table z0
(a int, b int);\n drop table z0;\n"}' | psql cache > z0.log 2>&1

Session 2:

psql cache
=# \d
ERROR: cache lookup failed for relation 945897 (in RelationIsVisible,
namespace.c:406)

The previous discussion centered around working on on locking in
dependency.c whilst dropping related objects - but does this apply when
there is just one? Anyway I tried to understand what was happening and
the attached rather hacky patch seems to cure the behaviour - So I've
submitted it as a discussion aid, rather than 'the way of fixing
this'... since I'm hoping there is a better way :-)

regards

Mark

Attachment Content-Type Size
namspace.c.patch text/x-patch 2.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Cache lookup failed for relation X [was: DROP FUNCTION cache lookup failed for relation X]
Date: 2007-11-01 12:08:20
Message-ID: 2901.1193918900@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> writes:
> The previous discussion centered around working on on locking in
> dependency.c whilst dropping related objects - but does this apply when
> there is just one? Anyway I tried to understand what was happening and
> the attached rather hacky patch seems to cure the behaviour - So I've
> submitted it as a discussion aid, rather than 'the way of fixing
> this'... since I'm hoping there is a better way :-)

AFAICS this just makes the window a bit narrower.

regards, tom lane