Re: WAL replay bugs

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL replay bugs
Date: 2014-06-02 12:55:35
Message-ID: CAB7nPqQhdfirs1v9097V+3APKG6ZVmCChyD=sgb6Qv+DhYEzJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> And here is the tool itself. It consists of two parts:
>
> 1. Modifications to the backend to write the page images
> 2. A post-processing tool to compare the logged images between master and
> standby.
Having that into Postgres at the disposition of developers would be
great, and I believe that it would greatly reduce the occurrence of
bugs caused by WAL replay during recovery. So, with the permission of
the author, I have been looking at this facility for a cleaner
integration into Postgres.

Roughly, this utility is made of three parts:
1) A set of masking functions that can be used on page images to
normalize them. This is used to put magic numbers or enforce flag
values to make page content consistent across nodes. This is for
example the case of the free space between pd_lower and pd_upper,
pd_flags, etc. Of course this depends on the type of page (btree,
heap, etc.).
2) Facility to memorize, analyze if they have been modified, and flush
page images to a dedicated file. This interacts with the buffer
manager mainly.
3) Facility to reorder page images within the same WAL record as
master/standby may not write them in the same order on a standby or a
master due to for example lock released in different order. This is
part of the binary analyzing the diffs between master and standby.

As of now, 2) is integrated in the backend, 1) and 3) are part of the
contrib module. However I am thinking that 1) and 2) should be done in
core using an ifdef similar to CLOBBER_FREED_MEMORY, to mask the page
images and write them in a dedicated file (in global/ ?), while 3)
would be fine as a separate binary in contrib/. An essential thing to
add would be to have a set of regression tests that developers and
buildfarm machines could directly use.

Perhaps there are parts of what is proposed here that could be made
more generalized, like the masking functions. So do not hesitate if
you have any opinion on the matter.

Regards,
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2014-06-02 13:10:27 Re: backup_label revisited
Previous Message ash 2014-06-02 12:52:06 Re: Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?