Re: Production block comparison facility

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Production block comparison facility
Date: 2014-07-22 11:28:03
Message-ID: CA+U5nM+Sy6mnYApn5RyL8u9L2xBJdziMJCQ=S9rr_+f7h_9p=Q@mail.gmail.com
Lists: pgsql-hackers

On 22 July 2014 08:49, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Sun, Jul 20, 2014 at 5:31 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> The block comparison facility presented earlier by Heikki would not be
>> able to be used in production systems. ISTM that it would be desirable
>> to have something that could be used in that way.
>>
>> ISTM easy to make these changes
>>
>> * optionally generate a FPW for every WAL record, not just first
>> change after checkpoint
>> full_page_writes = 'always'
>>
>> * when an FPW arrives, optionally run a check to see if it compares
>> correctly against the page already there, when running streaming
>> replication without a recovery target. We could skip reporting any
>> problems until the database is consistent
>> full_page_write_check = on
>>
>> The above changes seem easy to implement.
>>
>> With FPW compression, this would be a usable feature in production.
>>
>> Comments?
>
> This is an interesting idea, and it would be easier to use than what
> has been submitted for CF1. However, full_page_writes set to "always"
> would generate a large amount of WAL even for small records,
> increasing I/O for the partition holding pg_xlog, and the frequency of
> checkpoints run on the system. Is this really something suitable for
> production?

For critical systems, yes, I think it is.

It would be possible to make that user-selectable for particular
transactions or tables.

> Then, looking at the code, we would need to tweak XLogInsert for the
> WAL record construction to always do a FPW and to update
> XLogCheckBufferNeedsBackup. Then for the redo part, we would need to
> do some extra operations in the area of
> RestoreBackupBlock/RestoreBackupBlockContents, including masking
> operations before comparing the content of the FPW and the current
> page.
>
> Does that sound right?

Yes. It doesn't look like very much code, because it fits well with
existing approaches.
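
To make the redo-side check concrete, here is a rough standalone sketch
of the "mask, then compare" step described above. It is not PostgreSQL
code: SKETCH_PAGE_SIZE, MaskRange, mask_page and fpw_matches_page are
all hypothetical names invented for illustration, and the byte ranges to
mask (fields that may legitimately differ between the FPW and the page
already there) are simply passed in by the caller rather than derived
from a real page layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 8 kB pages, PostgreSQL's default block size. */
#define SKETCH_PAGE_SIZE 8192

/* Hypothetical: a byte range whose contents may legitimately differ. */
typedef struct MaskRange
{
    size_t offset;
    size_t length;
} MaskRange;

/* Zero every masked range in the page, clamping ranges to the page end. */
void
mask_page(uint8_t *page, const MaskRange *ranges, size_t nranges)
{
    for (size_t i = 0; i < nranges; i++)
    {
        size_t len = ranges[i].length;

        if (ranges[i].offset >= SKETCH_PAGE_SIZE)
            continue;           /* range starts past the page: ignore it */
        if (len > SKETCH_PAGE_SIZE - ranges[i].offset)
            len = SKETCH_PAGE_SIZE - ranges[i].offset;
        memset(page + ranges[i].offset, 0, len);
    }
}

/*
 * Return true if the full-page image and the current page are identical
 * once the masked ranges are zeroed in both. Works on private copies so
 * that neither input page is modified.
 */
bool
fpw_matches_page(const uint8_t *fpw, const uint8_t *current,
                 const MaskRange *ranges, size_t nranges)
{
    uint8_t fpw_copy[SKETCH_PAGE_SIZE];
    uint8_t cur_copy[SKETCH_PAGE_SIZE];

    memcpy(fpw_copy, fpw, SKETCH_PAGE_SIZE);
    memcpy(cur_copy, current, SKETCH_PAGE_SIZE);

    mask_page(fpw_copy, ranges, nranges);
    mask_page(cur_copy, ranges, nranges);

    return memcmp(fpw_copy, cur_copy, SKETCH_PAGE_SIZE) == 0;
}
```

In a real implementation the caller would presumably run this only when
the check is enabled, and hold back any mismatch report until the
database has reached consistency, as proposed upthread.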

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
