Re: Production block comparison facility

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Production block comparison facility
Date: 2014-07-20 08:31:26
Message-ID: CA+U5nMLb2g0Rreatc_HJb6VDMgh9DKfnya1YkZFckptcaZXeaw@mail.gmail.com

The block comparison facility presented earlier by Heikki could not be
used in production systems. ISTM that it would be desirable to have
something that could be used that way.

ISTM easy to make these changes

* optionally generate an FPW for every WAL record, not just the first
change after a checkpoint
full_page_writes = 'always'

* when an FPW arrives, optionally run a check to see if it compares
correctly against the page already there, when running streaming
replication without a recovery target. We could skip reporting any
problems until the database is consistent
full_page_write_check = on

The above changes seem easy to implement.

With FPW compression, this would be a usable feature in production.

Comments?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-22 07:49:12
Message-ID: CAB7nPqQmb7HGoi-8BEM2+UughLfMRCHJdWsKa7MzG9NsSo+u_Q@mail.gmail.com

On Sun, Jul 20, 2014 at 5:31 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> The block comparison facility presented earlier by Heikki would not be
> able to be used in production systems. ISTM that it would be desirable
> to have something that could be used in that way.
>
> ISTM easy to make these changes
>
> * optionally generate a FPW for every WAL record, not just first
> change after checkpoint
> full_page_writes = 'always'
>
> * when an FPW arrives, optionally run a check to see if it compares
> correctly against the page already there, when running streaming
> replication without a recovery target. We could skip reporting any
> problems until the database is consistent
> full_page_write_check = on
>
> The above changes seem easy to implement.
>
> With FPW compression, this would be a usable feature in production.
>
> Comments?

This is an interesting idea, and it would be easier to use than what
has been submitted for CF1. However, full_page_writes set to "always"
would generate a large amount of WAL even for small records,
increasing I/O on the partition holding pg_xlog as well as the
frequency of checkpoints on the system. Is this really something
suitable for production?
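
As a rough illustration of the WAL volume concern, here is a back-of-envelope sketch in Python. The workload numbers (10,000 records of 100 bytes touching 500 pages within one checkpoint cycle) are hypothetical, and record headers, the page "hole" optimisation and FPW compression are all ignored; only the 8192-byte default block size is a PostgreSQL fact.

```python
BLCKSZ = 8192  # default PostgreSQL block size, in bytes


def wal_bytes(n_records, record_bytes, pages_touched, always_fpw):
    """Rough WAL volume for n_records changes spread over pages_touched pages.

    Hypothetical model: ignores record headers, the page "hole" and compression.
    """
    if always_fpw:
        # Proposed full_page_writes = 'always': one page image per record
        fpw_count = n_records
    else:
        # Usual behaviour: one image for the first change to each page
        fpw_count = min(n_records, pages_touched)
    return n_records * record_bytes + fpw_count * BLCKSZ


# Made-up workload within a single checkpoint cycle
normal = wal_bytes(10_000, 100, 500, always_fpw=False)
always = wal_bytes(10_000, 100, 500, always_fpw=True)
```

With these made-up numbers the "always" setting writes roughly 16 times as much WAL; real ratios depend entirely on the workload.
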
Then, looking at the code, we would need to tweak XLogInsert for the
WAL record construction to always do a FPW and to update
XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
do some extra operations in the area of
RestoreBackupBlock/RestoreBackupBlockContents, including masking
operations before comparing the content of the FPW and the current
page.

Does that sound right?
--
Michael


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-22 11:28:03
Message-ID: CA+U5nM+Sy6mnYApn5RyL8u9L2xBJdziMJCQ=S9rr_+f7h_9p=Q@mail.gmail.com

On 22 July 2014 08:49, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Sun, Jul 20, 2014 at 5:31 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> The block comparison facility presented earlier by Heikki would not be
>> able to be used in production systems. ISTM that it would be desirable
>> to have something that could be used in that way.
>>
>> ISTM easy to make these changes
>>
>> * optionally generate a FPW for every WAL record, not just first
>> change after checkpoint
>> full_page_writes = 'always'
>>
>> * when an FPW arrives, optionally run a check to see if it compares
>> correctly against the page already there, when running streaming
>> replication without a recovery target. We could skip reporting any
>> problems until the database is consistent
>> full_page_write_check = on
>>
>> The above changes seem easy to implement.
>>
>> With FPW compression, this would be a usable feature in production.
>>
>> Comments?
>
> This is an interesting idea, and it would be easier to use than what
> has been submitted for CF1. However, full_page_writes set to "always"
> would generate a large amount of WAL even for small records,
> increasing I/O for the partition holding pg_xlog, and the frequency of
> checkpoints run on system. Is this really something suitable for
> production?

For critical systems, yes, I think it is.

It would be possible to make that user selectable for particular
transactions or tables.

> Then, looking at the code, we would need to tweak XLogInsert for the
> WAL record construction to always do a FPW and to update
> XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
> do some extra operations in the area of
> RestoreBackupBlock/RestoreBackupBlockContents, including masking
> operations before comparing the content of the FPW and the current
> page.
>
> Does that sound right?

Yes, it doesn't look like very much code because it fits well with
existing approaches.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Greg Stark <stark(at)mit(dot)edu>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-22 11:54:58
Message-ID: CAM-w4HPk0mmt1SPFmNBJdd6affOEB23UAoW8KE1EnEqbZ7SqDw@mail.gmail.com

If you're always going FPW then there's no point in the rest of the record.
The point here was to find problems so that users could run normally with
confidence.

The cases where you might want to run in the mode you describe are the
build farm or integration testing. When testing your application on the
next release of postgres it would be nice to have tests for the
replication in your workload, given the experience in 9.3.

Even without the constant full page writes a live production system could
do a FPW comparison after a FPW if it was in a consistent state. That would
give standbys periodic verification at low cost.

--
greg


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-22 12:46:13
Message-ID: CA+U5nMKPoc6Z32Zu+3xrpf+MdYcEA6h1j=6LRHnSMRwKT367Cg@mail.gmail.com

On 22 July 2014 12:54, Greg Stark <stark(at)mit(dot)edu> wrote:
> If you're always going FPW then there's no point in the rest of the record.

I think it's a simple matter to mark them XLP_BKP_REMOVABLE and to skip
any optimization of the remainder of WAL records.

> The point here was to find problems so that users could run normally with
> confidence.

Yes, but a full overwrite mode would provide an even safer mode of operation.

> The cases you might want to run in the mode you describe are the build farm
> or integration testing. When testing your application on the next release
> of postgres it would be nice to have tests for the replication in your
> workload given the experience in 9.3.
>
> Even without the constant full page writes a live production system could do
> a FPW comparison after a FPW if it was in a consistent state. That would
> give standbys periodic verification at low costs.

Yes, the two options I proposed are somewhat independent of each other.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-23 14:14:25
Message-ID: CAB7nPqR4vxdKijP+Du82vOcOnGMvutq-gfqiU2dsH4bsM77hYg@mail.gmail.com

On Tue, Jul 22, 2014 at 4:49 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> Then, looking at the code, we would need to tweak XLogInsert for the
> WAL record construction to always do a FPW and to update
> XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
> do some extra operations in the area of
> RestoreBackupBlock/RestoreBackupBlockContents, including masking
> operations before comparing the content of the FPW and the current
> page.
>
> Does that sound right?
>

I have spent some time digging further into this idea and ended up with the
patch attached, which does the following: it adds a consistency check when
an FPW is restored and applied on a given page.

The consistency check is made of two phases:
- Apply a mask on the FPW and the current page to eliminate potential
conflicts, hint bits for example.
- Check that the FPW is consistent with the current page, i.e. that the
current page does not contain any information that the FPW lacks. This
is done by comparing the masked versions of the FPW and the current page.
Also some more details:
- If an inconsistency is found, a WARNING is simply logged.
- The consistency check is done only if the current page is not empty and
the database has reached a consistent state.
- The page masking API is taken from the WAL replay patch that was
submitted in CF1 and plugged in as an independent set of APIs.
- For masking, to make it easier to detect whether a page is used by a
sequence relation, SEQ_MAGIC as well as its opaque data structure are
renamed and moved into sequence.h.
- To facilitate debugging and comparison, the masked FPW and current page
are also converted into hex.
Things could be refactored and improved for sure, but this patch is already
useful as-is so I am going to add it to the next commit fest.
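
The masking phase described above can be illustrated with a small Python sketch. It is not taken from the attached patch (which is C): pages are modelled as plain byte strings, the byte ranges to mask are passed in by the caller, and the value given to MASK_MARKER is an assumption. How the remaining differences are judged is left out; the sketch only reports where the masked images differ.

```python
MASK_MARKER = 0xFF  # name borrowed from the thread; the value is an assumption


def mask_page(page, volatile_ranges):
    """Return a copy of page with each (start, end) byte range overwritten."""
    masked = bytearray(page)
    for start, end in volatile_ranges:
        masked[start:end] = bytes([MASK_MARKER]) * (end - start)
    return bytes(masked)


def masked_diff(fpw, current, volatile_ranges):
    """Return the byte offsets at which the two masked images differ.

    Both images are assumed to have the same length.
    """
    m_fpw = mask_page(fpw, volatile_ranges)
    m_cur = mask_page(current, volatile_ranges)
    return [i for i, (a, b) in enumerate(zip(m_fpw, m_cur)) if a != b]
```

For instance, two 16-byte "pages" that differ only inside a masked range, say byte 3 standing in for hint bits, yield an empty list, while a difference at an unmasked offset is reported.
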

Comments are welcome.
Regards,
--
Michael

Attachment: 0001-Add-facility-to-check-FPW-consistency-at-WAL-replay.patch (text/plain, 18.0 KB)

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-23 15:35:13
Message-ID: CA+U5nMJacYOcQhWOmA_gdEEfTGhoJLUtXj9yQUAQsynp8UXqCg@mail.gmail.com

On 23 July 2014 15:14, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

> I have spent some time digging more into this idea and finished with the
> patch attached

Thank you for investigating the idea. I'll review by Monday.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-24 11:35:04
Message-ID: CAB7nPqQ9pSZUjX-oYioFM2F_qqZwKd2ZsMfab8c53Zhnv+sBQQ@mail.gmail.com

On Thu, Jul 24, 2014 at 12:35 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 23 July 2014 15:14, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>
>> I have spent some time digging more into this idea and finished with the
>> patch attached
>
> Thank you for investigating the idea. I'll review by Monday.
OK, thanks. Here are a couple of things that are not really necessary
for the feature but that I did to facilitate testing and review of the
patch:
- Some information is logged to the user as DEBUG1 even if the current
page and FPW are consistent. It may be better to remove it.
- The FPW/page consistency check is done after converting them to hex.
It is done this way only to facilitate viewing the page diffs with a
debugger. A better method would be to perform the checks using
MASK_MARKER (which should be moved to bufmask.h btw). It may be better
to put all this hex magic within a WAL_DEBUG ifdef.
Regards,
--
Michael


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-24 11:36:45
Message-ID: 20140724113645.GC16857@alap3.anarazel.de

On 2014-07-24 20:35:04 +0900, Michael Paquier wrote:
> - FPW/page consistency check is done after converting them to hex.
> This is done only this way to facilitate viewing the page diffs with a
> debugger. A best method would be to perform the checks using
> MASK_MARKER (which should be moved to bufmask.h btw). It may be better
> to put all this hex magic within a WAL_DEBUG ifdef.

Can't you just do "p/x whatever" in the debugger to display things in
hex?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-24 11:40:51
Message-ID: CAB7nPqSjZzBnBRnO0P-8vf4-gdrrGbkP_kW9WHj4H_nDzLu7eA@mail.gmail.com

On Thu, Jul 24, 2014 at 8:36 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-07-24 20:35:04 +0900, Michael Paquier wrote:
>> - FPW/page consistency check is done after converting them to hex.
>> This is done only this way to facilitate viewing the page diffs with a
>> debugger. A best method would be to perform the checks using
>> MASK_MARKER (which should be moved to bufmask.h btw). It may be better
>> to put all this hex magic within a WAL_DEBUG ifdef.
>
> Can't you just do "p/x whatever" in the debugger to display things in
> hex?
Well yes :p
--
Michael


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-29 10:30:12
Message-ID: 53D777B4.3040300@vmware.com

On 07/23/2014 05:14 PM, Michael Paquier wrote:
> On Tue, Jul 22, 2014 at 4:49 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
> wrote:
>
>> Then, looking at the code, we would need to tweak XLogInsert for the
>> WAL record construction to always do a FPW and to update
>> XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
>> do some extra operations in the area of
>> RestoreBackupBlock/RestoreBackupBlockContents, including masking
>> operations before comparing the content of the FPW and the current
>> page.
>>
>> Does that sound right?
>>
>
> I have spent some time digging more into this idea and finished with the
> patch attached, doing the following: addition of a consistency check when
> FPW is restored and applied on a given page.
>
> The consistency check is made of two phases:
> - Apply a mask on the FPW and the current page to eliminate potential
> conflicts like hint bits for example.
> - Check that the FPW is consistent with the current page, aka the current
> page does not contain any new information that the FPW taken has not. This
> is done by checking the masked portions of the FPW and the current page.
> Also some more details:
> - If an inconsistency is found, a WARNING is simply logged.
> - The consistency check is done if current page is not empty, and if
> database has reached a consistent state.
> - The page masking API is taken from the WAL replay patch that was
> submitted in CF1 and plugged in as an independent set of API.
> - In masking stuff, to facilitate if a page is used by a sequence relation
> SEQ_MAGIC as well as the its opaque data structure are renamed and moved
> into sequence.h.
> - To facilitate debugging and comparison, the masked FPW and current page
> are also converted into hex.
> Things could be refactored and improved for sure, but this patch is already
> useful as-is so I am going to add it to the next commit fest.

I don't understand how this works. A full-page image contains the new
page contents *after* the WAL-logged operation. For example, in a heap
insert, the full-page image contains the new tuple. How can you compare
that with what's on the disk already?

ISTM you'd need to log two full-page images for every WAL record. A
before image and an after image. Then you could do a lot of checking:

1. the before image should match what's on disk already
2. the result after applying the WAL record should match the after image.

That would be more handy than the approach I used, where the page images
are logged to a separate file. You wouldn't need to deal with any new
files, as all the data is in the WAL. Verification would be done
directly in the standby, with no need to run any extra programs.
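
The two checks listed above could look roughly like the following Python sketch. It is only an illustration of the idea: pages are byte strings, `redo` is a hypothetical callable standing in for the record's redo routine, and masking of hint bits and the like is omitted.

```python
def verify_record(disk_page, before_image, after_image, redo):
    """Return a list of problems found while replaying one WAL record."""
    problems = []
    # Check 1: the before image should match what is on disk already
    if disk_page != before_image:
        problems.append("before image does not match page on disk")
    # Check 2: applying the record should produce the after image
    replayed = redo(disk_page)
    if replayed != after_image:
        problems.append("replayed page does not match after image")
    return problems


def append_tuple(page):
    """Hypothetical redo routine: 'insert' one byte at the end of the page."""
    return page + b"x"
```

An empty list means both checks passed for that record.
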

- Heikki


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-29 13:00:26
Message-ID: CAB7nPqTZR7EKgQHPFjbbp_vo4MpW8BLiz-0RqqdApFFSEo-MpA@mail.gmail.com

On Tue, Jul 29, 2014 at 7:30 PM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> I don't understand how this works. A full-page image contains the new page
> contents *after* the WAL-logged operation. For example, in a heap insert,
> the full-page image contains the new tuple. How can you compare that with
> what's on the disk already?

An exact match of the FPW and the current page is not done. The patch
as it stands now checks whether an FPW is consistent with the content of
the current page, i.e. that the current page does not include changes
that diverge from what the FPW has.
For example, for a heap insert, if the current page has N records
pointer1/tup1..pointerN/tupN, the FPW should only contain the (N+1) records
pointer1/tup1..pointer(N+1)/tup(N+1). After applying the mask at block
recovery, the process simply checks that the FPW and the current page both
contain the first N records, marking the FPW and the current page as
inconsistent if the current page has some garbage, such as extra tuple
entries not in the FPW. I am sure you have arguments against that though...
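
A minimal Python sketch of the heap insert example above, assuming a page can be modelled as an ordered list of already-masked items (the function name and the list model are illustrative, not the patch's data structures):

```python
def fpw_consistent_with_page(fpw_items, page_items):
    """True if every item on the current page also appears, at the same
    position, in the full-page image; the FPW may hold extra, newer items."""
    if len(page_items) > len(fpw_items):
        # The current page has entries the FPW knows nothing about
        return False
    # The first N items must be identical on both sides
    return fpw_items[:len(page_items)] == page_items
```

With a current page holding tup1 and tup2, an FPW holding tup1, tup2 and tup3 is accepted, while a current page with an extra entry missing from the FPW is flagged as inconsistent.
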

> ISTM you'd need to log two full-page images for every WAL record. A before
> image and an after image.
The after image is the current FPW, so there is nothing else to do for
it. But for the before buffer, what do you think about using
ReadBufferExtended with RBM_NORMAL? We could grab its content from
disk in XLogInsert only when we are sure that a backup block is added.

> Then you could do a lot of checking:
> 1. the before image should match what's on disk already
> 2. the result after applying the WAL record should match the after image.
A WAL record can contain up to XLR_MAX_BKP_BLOCKS backup blocks.
Should we double it from 4 to 8?

> That would be more handy than the approach I used, where the page images are
> logged to a separate file. You wouldn't need to deal with any new files, as
> all the data is in the WAL. Verification would be done directly in the
> standby, with no need to run any extra programs.
In this case, would it be better to control that with a GUC? Making that
the default will increase the amount of WAL for all types of
applications, unless coupled with FPW compression...
Regards,
--
Michael


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-31 05:59:45
Message-ID: CA+U5nMJHRkM-Pz1LG3792TaYtFeMMdKtq4KSyhVezo0EzbVvRw@mail.gmail.com

On 29 July 2014 11:30, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> I don't understand how this works. A full-page image contains the new page
> contents *after* the WAL-logged operation. For example, in a heap insert,
> the full-page image contains the new tuple. How can you compare that with
> what's on the disk already?
>
> ISTM you'd need to log two full-page images for every WAL record. A before
> image and an after image. Then you could do a lot of checking:
>
> 1. the before image should match what's on disk already
> 2. the result after applying the WAL record should match the after image.
>
> That would be more handy than the approach I used, where the page images are
> logged to a separate file. You wouldn't need to deal with any new files, as
> all the data is in the WAL. Verification would be done directly in the
> standby, with no need to run any extra programs.

It doesn't matter whether we take a before or after image of the page.

What is important is that we make the check on the standby at the same
point as the full page was taken on the master. After all, the pages
are marked as removable.

Given the pages are after images, then we just make the check after
applying WAL.

So I don't see the need for two full page images.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-31 06:45:20
Message-ID: CAB7nPqS8dF7HDsGU3HSOkusfEZp_XyPP8mAwwL5Z6AeC4YVLEw@mail.gmail.com

On Thu, Jul 31, 2014 at 2:59 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 29 July 2014 11:30, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>
>> I don't understand how this works. A full-page image contains the new page
>> contents *after* the WAL-logged operation. For example, in a heap insert,
>> the full-page image contains the new tuple. How can you compare that with
>> what's on the disk already?
>>
>> ISTM you'd need to log two full-page images for every WAL record. A before
>> image and an after image. Then you could do a lot of checking:
>>
>> 1. the before image should match what's on disk already
>> 2. the result after applying the WAL record should match the after image.
>>
>> That would be more handy than the approach I used, where the page images are
>> logged to a separate file. You wouldn't need to deal with any new files, as
>> all the data is in the WAL. Verification would be done directly in the
>> standby, with no need to run any extra programs.
>
> It doesn't matter whether we take a before or after image of the page.
>
> What is important is that we make the check on the standby at the same
> point as the full page was taken on the master. After all, the pages
> are marked as removable.
>
> Given the pages are after images, then we just make the check after
> applying WAL.
>
> So I don't see the need for two full page images.
By doing so you definitely need an additional mode for full-page
writes: one ensuring that the process does not apply this FPW because it
wants to compare it to the current page after applying WAL. This
increases the footprint of the feature on the code because all the code
paths where RestoreBackupBlock is called need to be bypassed.
--
Michael


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-31 07:07:52
Message-ID: CA+U5nML-TXpjUVbYyCwY7STqEWBcoKLw_6NaEOyPVxbR3KiKng@mail.gmail.com

On 31 July 2014 07:45, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:

>> So I don't see the need for two full page images.

> By doing so you definitely need an additional mode for full-page
> writes: one certifying that process does not apply this FPW because it
> wants to compare it to current page after applying the WALs. This
> increases the footprint of the feature on code because all the code
> paths where RestoreBackupBlock is called need to be bypassed.

Yeh, it looks like you need to do CheckBackupBlock() exactly as many
times as you do RestoreBackupBlock(), with the sequence of actions
being RestoreBackupBlock(), apply WAL then CheckBackupBlock(). That
will work without much code churn, it will be just a one line addition
in a few dozen places.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-31 08:02:57
Message-ID: CAB7nPqSazOSnSbcE5fqjF=WFS_pEuZbno9drPdqS1sEFf9j_ww@mail.gmail.com

On Thu, Jul 31, 2014 at 4:07 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Yeh, it looks like you need to do CheckBackupBlock() exactly as many
> times as you do RestoreBackupBlock(), with the sequence of actions
> being RestoreBackupBlock(), apply WAL then CheckBackupBlock(). That
> will work without much code churn, it will be just a one line addition
> in a few dozen places.
Additionally, as this is a recovery-only feature, I was thinking that
it would be better to control it with a parameter of recovery.conf.
Let's call it check_full_page_writes for example. Thoughts?
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-08-13 03:39:21
Message-ID: CAB7nPqQMq=4eJAK317mxZ4Has0i+1rSLBQU29zx18JwLB2j1OA@mail.gmail.com

On Wed, Jul 23, 2014 at 11:14 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> Things could be refactored and improved for sure, but this patch is already
> useful as-is so I am going to add it to the next commit fest.

After some more investigation, I am going to mark this patch as
"Returned with feedback" for the time being (mainly to let it show up
on the commit fest app and for the sake of the archives), for two
reasons:
- We can do better than what I sent: instead of checking if the FPW
and the current page are somewhat consistent, we could actually check
if the current page is equal to the FPW after applying WAL on it. In
order to do that, we would need to bypass the FPW replay and apply
WAL on the current page (if the page is already initialized),
controlling RestoreBackupBlock (or its equivalent) with an additional
flag telling that the block is "not restored, but can get WAL applied to
it safely". Then a comparison with the FPW contained in the WAL record
can be made.
- The patch of Heikki to change the WAL APIs and track block changes
more easily is going to make this implementation far easier. It
also improves the status checks on which block has been restored, so
it is more easily extensible for what could be done here.
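
The first point above could be sketched as follows in Python. This is an illustration under assumptions, not a design: pages are byte strings, an uninitialized page is modelled as None or all zeroes, `redo` is a hypothetical callable applying the record's changes, masking is omitted, and the name check_backup_block merely echoes the CheckBackupBlock() suggested earlier in the thread.

```python
def check_backup_block(current_page, fpw, redo):
    """Return True or False for match or mismatch, or None if the check is skipped."""
    if current_page is None or not any(current_page):
        # Uninitialized page: nothing to compare, the FPW is restored as usual
        return None
    # FPW restore is bypassed; the record is applied to the existing page
    replayed = redo(current_page)
    # The replayed page should now be equal to the image carried by the record
    return replayed == fpw


def set_last_byte(page):
    """Hypothetical redo routine: set the last byte of the page to 1."""
    return page[:-1] + b"\x01"
```

What happens to the page after a mismatch (keep the replayed page or overwrite it with the FPW) is not settled in the thread, so the sketch only reports the result.
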

Regards,
--
Michael