Re: Going for "all green" buildfarm results

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-18 14:06:52
Message-ID: 8105.1155910012@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
> Tom Lane wrote:
>> Vacuum's always had a race condition: it makes a list of rel OIDs and
>> then tries to vacuum each one. It narrows the window for failure by
>> doing a SearchSysCacheExists test before relation_open, but there's
>> still a window for failure.

> hmm yeah - missed the VACUUM; part of the regression diff.
> Still this means we will have to live with (rare) failures once in a
> while during that test ?

I thought of what seems a pretty simple solution for this: make VACUUM
lock the relation before doing the SearchSysCacheExists, ie instead
of the existing code

if (!SearchSysCacheExists(RELOID,
ObjectIdGetDatum(relid),
0, 0, 0))
// give up

lmode = vacstmt->full ? AccessExclusiveLock : ShareUpdateExclusiveLock;

onerel = relation_open(relid, lmode);

do

lmode = vacstmt->full ? AccessExclusiveLock : ShareUpdateExclusiveLock;

LockRelationOid(relid, lmode);

if (!SearchSysCacheExists(RELOID,
ObjectIdGetDatum(relid),
0, 0, 0))
// give up

onerel = relation_open(relid, NoLock);

Once we're holding lock, we can be sure there's not a DROP TABLE in
progress, so there's no race condition anymore. It's OK to take a
lock on the OID of a relation that no longer exists, AFAICS; we'll
just drop it again immediately (the "give up" path includes transaction
exit, so there's not even any extra code needed).

This wasn't possible before the recent adjustments to the relation
locking protocol, but now it looks trivial ... am I missing anything?

Perhaps it is worth folding this test into a "conditional_relation_open"
function that returns NULL instead of failing if the rel no longer
exists. I think there are potential uses in CLUSTER and perhaps REINDEX
as well as VACUUM.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-08-18 14:15:05 Re: Autovacuum on by default?
Previous Message Andrew Dunstan 2006-08-18 14:01:20 Re: Going for "all green" buildfarm results