Re: Report: race conditions in WAL replay routines

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Report: race conditions in WAL replay routines
Date: 2012-02-06 09:13:05
Message-ID: CA+U5nM+ZPOFnhUHFa=ttyn7HoQyohmp1AHSExHPsNzYoC38V0w@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 5, 2012 at 10:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> Please post the patch rather than fixing directly. There's some subtle
>> stuff there and it would be best to discuss first.
>
> Here's a proposed patch for the issues around unlocked updates of
> shared-memory state.  After going through this I believe that there is
> little risk of any real problems in the current state of the code; this
> is more in the nature of future-proofing against foreseeable changes.
> (One such is that we'd discussed fixing the age() function to work
> during Hot Standby.)  So I suggest applying this to HEAD but not
> back-patching.

All looks very good to me. Agreed.

> Except for one thing.  I realized while looking at the NEXTOID replay
> code that it is completely broken: it only advances
> ShmemVariableCache->nextOid when that's less than the value in the WAL
> record.  So that comparison fails if the OID counter wraps around during
> replay.  I've fixed this in the attached patch by just forcibly
> assigning the new value instead of trying to be smart, and I think
> probably that aspect of it needs to be back-patched.

Ouch! Well spotted.

I suggest fixing that as a separate patch; it looks like it needs back-patching to 8.0.
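
To make the wraparound case concrete, here is a minimal standalone sketch
in C. It is not the PostgreSQL source: the function names, the Oid typedef
and the sample values are assumptions chosen for illustration. Only the two
behaviours come from the discussion above: advancing the counter only when
it is less than the value in the WAL record, versus assigning the new value
unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: OIDs modelled as an unsigned 32-bit counter that can wrap. */
typedef uint32_t Oid;

/* Hypothetical stand-in for the replay step applied per NEXTOID record. */
typedef void (*ReplayFn)(Oid *counter, Oid from_wal);

/* Behaviour described as broken: advance only if the counter is behind. */
static void replay_nextoid_conditional(Oid *counter, Oid from_wal)
{
    if (*counter < from_wal)
        *counter = from_wal;
}

/* Behaviour in the proposed fix: forcibly assign the value from the record. */
static void replay_nextoid_forced(Oid *counter, Oid from_wal)
{
    *counter = from_wal;
}

/* Apply a sequence of record values in order and return the final counter. */
static Oid replay_all(ReplayFn fn, Oid start, const Oid *records, size_t n)
{
    Oid counter = start;
    for (size_t i = 0; i < n; i++)
        fn(&counter, records[i]);
    return counter;
}

/* Illustrative check: the second record is logged after the counter wrapped. */
static void self_check(void)
{
    /* Sample values only; 16384 is an arbitrary small post-wrap value here. */
    const Oid records[] = { 4294960000u, 16384u };

    /* The comparison fails after wraparound, so the counter stays stale. */
    assert(replay_all(replay_nextoid_conditional, 4294950000u, records, 2)
           == 4294960000u);

    /* Unconditional assignment follows the primary across the wrap. */
    assert(replay_all(replay_nextoid_forced, 4294950000u, records, 2)
           == 16384u);
}
```

In this sketch the conditional version ignores the post-wrap record because
the stale counter is numerically larger, which is the failure mode described
in the quoted text.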

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
