"Bogus data in lock file" shouldn't be FATAL?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: "Bogus data in lock file" shouldn't be FATAL?
Date: 2010-08-16 15:18:32
Message-ID: 4987.1281971912@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

This complaint:
http://archives.postgresql.org/pgsql-admin/2010-08/msg00111.php

seems to suggest that this code in CreateLockFile() is not well-thought-out:

if (other_pid <= 0)
elog(FATAL, "bogus data in lock file \"%s\": \"%s\"",
filename, buffer);

as it means that a corrupted (empty, in this case) postmaster.pid file
prevents the server from starting until somebody intervenes manually.

I think that the original concern was that if we couldn't read valid
data out of postmaster.pid then we couldn't be sure if there were a
conflicting postmaster running. But if that's the plan then
CreateLockFile is violating it further down, where it happily skips the
PGSharedMemoryIsInUse check if it fails to pull shmem ID numbers from
the file.

We could perhaps address that risk another way: after having written
postmaster.pid, try to read it back to verify that it contains what we
wrote, and abort if not. Then, if we can't read it during startup,
it's okay to assume there is no conflicting postmaster.

Or, given the infrequency of complaints, maybe it's better not to touch
this. Thoughts?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Itagaki Takahiro 2010-08-16 15:20:34 Re: Committers info for the git migration - URGENT!
Previous Message Stephen Frost 2010-08-16 15:15:20 Re: security label support, part.2