Re: Proposal to add a QNX 6.5 port to PostgreSQL

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Baker, Keith [OCDUS Non-J&J]" <KBaker9(at)its(dot)jnj(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal to add a QNX 6.5 port to PostgreSQL
Date: 2014-08-18 13:59:46
Message-ID: 20140818135946.GB461982@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Aug 18, 2014 at 09:01:20AM -0400, Robert Haas wrote:
> On Sat, Aug 16, 2014 at 3:28 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> >> I'd be afraid that a secondary mechanism that mostly-but-not-really
> >> works could do more harm by allowing us to miss bugs in the primary,
> >> pipe-based locking mechanism than the good it would accomplish.
> >
> > Users do corrupt their NFS- and GFS2-hosted databases today. I would rather
> > have each process hold only an fcntl() lock than hold only the FIFO file
> > descriptor. There's no such dichotomy, so let's have both.
>
> Meh. We can do that, but I think that will provide us with only the
> it-works-until-it-doesn't level of protection. Granted, that's more
> than zero, but does anyone advocate wearing seatbelts for the first 60
> minutes you're in the car and then taking them off after that? I
> think that with a sufficiently long-running server the chances of the
> lock somehow getting released approach certainty. But I'm not going
> to fight this one tooth and nail.

In case it wasn't clear, I advocate both using the FIFO defense and holding
fcntl locks throughout the life of every PostgreSQL process having a shared
memory attachment. I grant that this raises the chance of a shortcoming in
one mechanism remaining undiscovered. However, we already know that each by
itself has limitations. I don't like the prospect of accepting a known hole
to help discover unknown holes.

We could have the would-be new postmaster, when it hits an fcntl lock conflict,
proceed with the FIFO check anyway. If the FIFO check says "go" after the
fcntl check said "stop", emit a message about the apparent bug. (That's
oversimplified; it needs looping to account for the case of the old postmaster
exiting concurrently.)
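
A minimal sketch of that cross-check follows, written in Python for brevity
rather than the C a real patch would use. Every function name here is
hypothetical. The FIFO probe assumes the defense works by having each live
process keep the FIFO open for reading, so that a non-blocking write-only
open fails with ENXIO once no such process remains; the message above does
not spell that out. Refusing to start after reporting the apparent bug is
also an assumption, chosen as the conservative outcome.

```python
import errno
import fcntl
import os
import time


def fcntl_lock_conflicts(lock_fd):
    """Return True if another process holds a conflicting fcntl lock.

    On success the calling process keeps the exclusive lock.
    lock_fd must be open for writing.
    """
    try:
        fcntl.lockf(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return False
    except OSError as e:
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return True
        raise


def fifo_has_reader(fifo_path):
    """Return True if some process still has the FIFO open for reading."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:  # POSIX: FIFO has no reader
            return False
        raise
    os.close(fd)
    return True


def may_start(fcntl_conflict, fifo_reader, retries=5, delay=0.2, warn=print):
    """Decide whether a would-be new postmaster may start.

    fcntl_conflict and fifo_reader are zero-argument callables returning
    True when the respective check says "stop". retries should be >= 1.
    """
    for attempt in range(retries):
        conflict = fcntl_conflict()
        reader = fifo_reader()
        if reader:
            return False  # FIFO check says "stop"
        if not conflict:
            return True   # both checks say "go"
        # fcntl said "stop" but the FIFO said "go". The old postmaster may
        # be exiting concurrently, so look again before complaining.
        if attempt < retries - 1:
            time.sleep(delay)
    warn("fcntl lock held but FIFO has no reader: apparent interlock bug")
    return False  # assumption: stay down when the two checks disagree
```

A caller would wire in the real probes, for example
may_start(lambda: fcntl_lock_conflicts(lock_fd), lambda: fifo_has_reader(fifo_path)),
where lock_fd and fifo_path name a lock file and FIFO in the data directory.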

> A bigger question in my view is what to do with the existing
> mechanism. The main advantage of making a change like this is that we
> could finally dispense with System V shared memory completely. But we
> risk encountering systems where the battle-tested System V mechanism
> works and this new one either fails to work at all (server won't
> start) or fails to work as desired (interlock broken). So it's
> tempting to think we should have a GUC or control-file setting to
> control which mechanism gets used. Of course for QNX, the actual
> subject of this thread, System V won't be an option, but other people
> might like a big red button they can push if the new code turns out to
> be less than we're hoping.

A GUC sounds fine to me, as would using the sysv interlock unconditionally for
a couple more releases before removing it.

Thanks,
nm
