"SMgrRelation hashtable corrupted" failure identified

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: "SMgrRelation hashtable corrupted" failure identified
Date: 2005-01-10 16:09:58
Message-ID: 13484.1105373398@sss.pgh.pa.us
Lists: pgsql-hackers

We've seen a few reports of the above-mentioned error message from
PG 8.0 testers, but up till now no one had come up with a reproducible
test case. I've now found a trivial example:

    session 1: create table a1 (f1 varchar(128));
    session 2: insert into a1 values('abc');
    session 1: alter table a1 alter column f1 type varchar(256);
    session 2: insert into a1 values('abcd');

    session 2 fails with ERROR: SMgrRelation hashtable corrupted
    continued use of session 2 leads to a crash

Many if not all scenarios involving a rewriting ALTER TABLE on a
table in active use by other backends will fail like this.
I believe there are probably similar failures involving CLUSTER,
though a quick try didn't show it. This seems clearly to be a
"must fix for 8.0" bug.

The basic problem is that when ALTER TABLE tries to swap the physical
files associated with the original table and the temp version of the
table, it sends out relcache inval events for all four combinations
of table OID and relfilenode. Because inval.c is a bit cavalier about
the ordering of inval events, the one that session 2 sees first is the
one for <temp table OID, old relfilenode>. It does not find a relcache
entry for the temp table OID, but it does find an smgr table entry for
the relfilenode, which it proceeds to drop. Now there is a dangling
smgr reference in its relcache, so when it next gets hit with a
relcache clear event for the original table OID, boom!
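
To make the sequence above concrete, here is a toy model of it in C. This is
only a sketch: every name prefixed with "Toy" or "toy_" is hypothetical, the
smgr table is a small static array rather than the real hash table, and the
OID and relfilenode values are made up. A dropped entry is marked with a flag
instead of being freed, so the stale pointer can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are PostgreSQL's real definitions. */
typedef struct ToySmgrEntry
{
    unsigned    relfilenode;    /* key: physical file identifier */
    bool        in_use;         /* false once the entry has been dropped */
} ToySmgrEntry;

typedef struct ToyRelcacheEntry
{
    unsigned      reloid;       /* key: table OID */
    ToySmgrEntry *rd_smgr;      /* cached pointer into the smgr table */
} ToyRelcacheEntry;

#define TOY_NSLOTS 4
static ToySmgrEntry toy_smgr_table[TOY_NSLOTS];

/* Find the smgr entry for a relfilenode, creating it if needed. */
static ToySmgrEntry *
toy_smgr_open(unsigned relfilenode)
{
    ToySmgrEntry *freeslot = NULL;

    for (int i = 0; i < TOY_NSLOTS; i++)
    {
        if (toy_smgr_table[i].in_use &&
            toy_smgr_table[i].relfilenode == relfilenode)
            return &toy_smgr_table[i];
        if (!toy_smgr_table[i].in_use && freeslot == NULL)
            freeslot = &toy_smgr_table[i];
    }
    assert(freeslot != NULL);   /* the toy table is assumed never to fill */
    freeslot->relfilenode = relfilenode;
    freeslot->in_use = true;
    return freeslot;
}

/* Drop an smgr entry by relfilenode alone, knowing nothing of the relcache. */
static void
toy_smgr_close_node(unsigned relfilenode)
{
    for (int i = 0; i < TOY_NSLOTS; i++)
    {
        if (toy_smgr_table[i].in_use &&
            toy_smgr_table[i].relfilenode == relfilenode)
            toy_smgr_table[i].in_use = false;
    }
}

/* True if the relcache entry still points at a dropped smgr entry. */
static bool
toy_rel_is_dangling(const ToyRelcacheEntry *rel)
{
    return rel->rd_smgr != NULL && !rel->rd_smgr->in_use;
}

/* Replay session 2's view of the first inval event; true means "dangling". */
static bool
toy_replay_failure(void)
{
    /* Session 2 has table a1 open: made-up OID 16384, relfilenode 1000. */
    ToyRelcacheEntry a1 = {16384, toy_smgr_open(1000)};

    /*
     * Inval for <temp table OID, old relfilenode>: there is no relcache
     * entry for the temp OID, but the smgr entry for 1000 is found and
     * dropped.
     */
    toy_smgr_close_node(1000);

    return toy_rel_is_dangling(&a1);
}
```

In this model toy_replay_failure() returns true: the relcache entry for a1
still holds a pointer to an smgr entry that no longer exists, which is the
state the next relcache clear trips over.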

I fooled around with trying to patch this by enforcing the "right"
processing order of inval events, but that doesn't work (it just moves
the failure into the sending backend, which it turns out would need
a different processing order to avoid crashing). It would be a horribly
fragile solution anyway.

I now think that the only reasonable fix is to directly attack the
problem of dangling relcache references to smgr table entries. What we
can do is add the concept of an "owning pointer" to an smgr entry, that
is, an "SMgrRelation *myowner" field, and have smgrclose do something
like
    if (reln->myowner)
        *(reln->myowner) = NULL;
For smgr table entries associated with a relcache entry, the relcache
code would set this field as a back link to its rel->rd_smgr pointer.
With this setup, an smgr-level clear would correctly unhook from the
relcache even if the clear did not come directly through the relcache.
This would simplify RelationCacheInvalidateEntry and
LocalExecuteInvalidationMessage, which could then treat relcache clear
and smgr clear as independent operations.
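
As a rough illustration of the proposal, the following sketch adds a myowner
back link to a toy smgr entry and clears the owner's pointer on close. Only
the myowner field and the two-line check come from the text above; everything
prefixed with "Own" or "own_" is hypothetical, the table is a static array
standing in for the real hash table, and the OID and relfilenode values are
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pointer typedef, in the style of SMgrRelation; toy definition only. */
typedef struct OwnSmgrData *OwnSmgrRelation;

struct OwnSmgrData
{
    unsigned         relfilenode;   /* key: physical file identifier */
    bool             in_use;        /* false once the entry is dropped */
    OwnSmgrRelation *myowner;       /* back link to the owner's pointer */
};

typedef struct OwnRelcacheEntry
{
    unsigned        reloid;         /* key: table OID */
    OwnSmgrRelation rd_smgr;        /* cached smgr pointer, or NULL */
} OwnRelcacheEntry;

#define OWN_NSLOTS 4
static struct OwnSmgrData own_smgr_table[OWN_NSLOTS];

/* Return the live entry for a relfilenode, or NULL if there is none. */
static OwnSmgrRelation
own_smgr_lookup(unsigned relfilenode)
{
    for (int i = 0; i < OWN_NSLOTS; i++)
    {
        if (own_smgr_table[i].in_use &&
            own_smgr_table[i].relfilenode == relfilenode)
            return &own_smgr_table[i];
    }
    return NULL;
}

/* Find or create the entry for a relfilenode; new entries have no owner. */
static OwnSmgrRelation
own_smgr_open(unsigned relfilenode)
{
    OwnSmgrRelation reln = own_smgr_lookup(relfilenode);

    if (reln != NULL)
        return reln;
    for (int i = 0; i < OWN_NSLOTS; i++)
    {
        if (!own_smgr_table[i].in_use)
        {
            own_smgr_table[i].relfilenode = relfilenode;
            own_smgr_table[i].in_use = true;
            own_smgr_table[i].myowner = NULL;
            return &own_smgr_table[i];
        }
    }
    assert(!"toy smgr table is full");
    return NULL;
}

/* Relcache side: store the pointer and register where it is stored. */
static void
own_smgr_set_owner(OwnSmgrRelation *owner, OwnSmgrRelation reln)
{
    reln->myowner = owner;
    *owner = reln;
}

/* Close an entry, unhooking it from its owner first. */
static void
own_smgr_close(OwnSmgrRelation reln)
{
    if (reln->myowner)
        *(reln->myowner) = NULL;
    reln->myowner = NULL;
    reln->in_use = false;
}

/* Smgr-level clear by relfilenode; it never consults the relcache. */
static void
own_smgr_close_node(unsigned relfilenode)
{
    OwnSmgrRelation reln = own_smgr_lookup(relfilenode);

    if (reln != NULL)
        own_smgr_close(reln);
}

/* Replay the problem sequence; true means the relcache was unhooked. */
static bool
own_replay_sequence(void)
{
    /* Made-up OID 16384 and relfilenode 1000 for table a1. */
    OwnRelcacheEntry a1 = {16384, NULL};

    own_smgr_set_owner(&a1.rd_smgr, own_smgr_open(1000));
    own_smgr_close_node(1000);      /* clear arrives via smgr only */
    return a1.rd_smgr == NULL;
}
```

Here the smgr-level clear resets a1.rd_smgr to NULL even though it never
looked at the relcache, so a later relcache clear for the same table has
nothing stale to follow.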

Comments?

regards, tom lane
