Re: Inadequate thought about buffer locking during hot standby replay

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inadequate thought about buffer locking during hot standby replay
Date: 2012-11-10 20:59:13
Message-ID: CA+U5nMLAuMzfOd4eFmU-MBGWFey8frJkYGXMiqy-OMwTFthfMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10 November 2012 18:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> I'm inclined to think that we need to fix this by getting rid of
>> RestoreBkpBlocks per se, and instead having the per-WAL-record restore
>> routines dictate when each full-page image is restored (and whether or
>> not to release the buffer lock immediately). That's not going to be a
>> small change unfortunately :-(
>
> Here's a WIP patch that attacks it in this way. I've only gone as far as
> fixing btree_xlog_split() for the moment, since the main point is to see
> whether the coding change seems reasonable. At least in this function,
> it seems that locking considerations are handled correctly as long as no
> full-page image is used, so it's pretty straightforward to tweak it to
> handle the case with full-page image(s) as well. I've not looked
> through any other replay functions yet --- some of them may need more
> invasive hacking. But so far this looks pretty good.

Looks fine, but we need a top-level comment in btree code explaining
why/how it follows the very well explained rules in the README.

> Note that this patch isn't correct yet even for btree_xlog_split(),
> because I've not removed the RestoreBkpBlocks() call in btree_redo().
> All the btree replay routines will have to get fixed before it's
> testable at all.

I was just about to write you back you missed that...

> One thing that seems a bit annoying is the use of zero-based backup
> block indexes in RestoreBackupBlock, when most (not all) of the callers
> are referencing macros with one-based names (XLR_BKP_BLOCK_1 etc).
> That's a bug waiting to happen. We could address it by changing
> RestoreBackupBlock to accept a one-based argument, but what I'm thinking
> of doing instead is getting rid of the one-based convention entirely;
> that is, remove XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4 and instead have
> all code use the XLR_SET_BKP_BLOCK() macro, which is zero-based. One
> advantage of doing that is that it'll force inspection of all code
> related to this.

I wouldn't do that in a back branch, but I can see why its a good idea.

Thanks for fixing.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-11-10 21:24:22 Re: Inadequate thought about buffer locking during hot standby replay
Previous Message Tom Lane 2012-11-10 18:16:46 Re: Inadequate thought about buffer locking during hot standby replay