Rethinking locking for database create/drop vs connection startup

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Rethinking locking for database create/drop vs connection startup
Date: 2006-05-03 20:15:43
Message-ID: 25537.1146687343@sss.pgh.pa.us
Lists: pgsql-hackers

This is motivated by Jim Buttafuoco's recent gripe about not being
able to connect while a DROP DATABASE is in progress:
http://archives.postgresql.org/pgsql-hackers/2006-05/msg00074.php
The whole area is really pretty grotty anyway --- the existing interlock
does not prevent an incoming connection from trying to connect to the
victim database; it only makes sure that we detect the collision later. This is not
very good, for two reasons. One is that you'll most likely get a very
unfriendly error message due to attempts to access already-missing
system catalogs; when I experimented just now I got

psql: FATAL: could not open relation 1663/104854/1259: No such file or directory

which is really not the way I'd like to report "database foo just got
deleted". The other problem is that I'm not entirely convinced that a
backend trying to do this won't leave any permanent problems behind,
most likely in the form of dirty shared buffers for subsequently-deleted
system catalogs in the victim database. ReverifyMyDatabase tries to
clean that up by doing DropDatabaseBuffers, but that only helps if we
get as far as ReverifyMyDatabase.

It strikes me that we now have a decent tool for solving the problem,
which is LockSharedObject() --- that is, there exists a locktag
convention whereby we can "take a lock" on a database as such, rather
than having to use table-level locks on pg_database as a proxy. The
locktag contains the database's OID, so it identifies the target DB
directly. If dropdb() takes such a lock before it checks for active
backends, then the connection sequence can look like this:

1. read pg_database flat file to find out OID of target DB
2. initialize far enough to be able to start a transaction, and do so
3. take a shared lock on the target DB by OID
4. re-read pg_database flat file and verify DB still exists

If step 4 fails to find the DB in the flat file, then we can bomb
out before we've made any attempt to touch catalogs of the target
DB. This ensures both a reasonable error message, and no pollution
of shared buffers. If we get past step 4 then we don't have to worry
about concurrent dropdb() anymore. (The shared lock will only last
until we commit the startup transaction, but that's OK --- once we
are listed in the PGPROC array we don't need the lock anymore.)

It's slightly annoying to have to read the flat file twice, but
for reasonable numbers of databases per installation I don't think
this will pose any material performance penalty. The file will
certainly still be sitting in kernel disk cache.

It's still necessary to serialize CREATE/DROP DATABASE commands against
each other, to ensure that only one backend tries to write the flat file
at a time, but with this scheme they'd not need to block connections
being made to unrelated databases.

Thoughts, better ideas?

regards, tom lane
