Regarding Checkpoint Redo Record

Lists: pgsql-hackers
From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: 'PG Hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Regarding Checkpoint Redo Record
Date: 2012-01-04 06:42:26
Message-ID: B6907F5537A1430AABC57E0C928ABE4A@china.huawei.com

Why does PostgreSQL need to write a WAL record for a checkpoint when it
maintains the same information in the pg_control file?

This may be required in case we need information about more than one
checkpoint, as pg_control can hold information about only the most recent
checkpoint. But I could not think of a case where information about more
than one checkpoint would be required.

Could anybody let me know the cases where it is required?


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: 'PG Hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding Checkpoint Redo Record
Date: 2012-01-04 15:56:55
Message-ID: 4F0476C7.2040306@enterprisedb.com

On 04.01.2012 08:42, Amit Kapila wrote:
> Why does PostgreSQL need to write a WAL record for a checkpoint when it
> maintains the same information in the pg_control file?

I guess it wouldn't be strictly necessary...

> This may be required in case we need information about more than one
> checkpoint, as pg_control can hold information about only the most recent
> checkpoint. But I could not think of a case where information about more
> than one checkpoint would be required.

There is a pointer in the control file to the previous checkpoint, too.
It's not normally needed, but we fall back to that if the latest
checkpoint cannot be read for some reason, like disk failure. If you
have a disk failure and cannot read the latest checkpoint, chances are
that you have a corrupt database anyway, but at least we try to recover
as much as we can using the previous checkpoint.
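
As an illustration only, the fallback described above can be sketched in
Python. This is not PostgreSQL source code: ControlFile,
read_checkpoint_record, and choose_recovery_checkpoint are hypothetical
names, and the WAL is modelled as a plain dictionary keyed by location.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ControlFile:
    # Hypothetical stand-in for pg_control: WAL locations of two checkpoints.
    latest_checkpoint: int
    prior_checkpoint: int


def read_checkpoint_record(wal: Dict[int, dict], location: int) -> Optional[dict]:
    """Return the record at `location` if it is a readable checkpoint record."""
    record = wal.get(location)
    if record is None or record.get("type") != "CHECKPOINT":
        return None  # missing, unreadable, or not a checkpoint record
    return record


def choose_recovery_checkpoint(control: ControlFile,
                               wal: Dict[int, dict]) -> Tuple[int, dict]:
    """Prefer the latest checkpoint; fall back to the previous one."""
    for location in (control.latest_checkpoint, control.prior_checkpoint):
        record = read_checkpoint_record(wal, location)
        if record is not None:
            return location, record
    raise RuntimeError("no usable checkpoint record found")


# The latest checkpoint (location 200) is unreadable, so recovery falls back.
wal = {100: {"type": "CHECKPOINT", "redo": 90}}
control = ControlFile(latest_checkpoint=200, prior_checkpoint=100)
location, record = choose_recovery_checkpoint(control, wal)
```

Checking the record type at the stored location is also the kind of sanity
check that is only possible because a checkpoint record exists in the WAL.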

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding Checkpoint Redo Record
Date: 2012-01-04 16:02:03
Message-ID: CA+U5nMLcxEhdAU_EJ1VuyesZM-zaTiqB35u6JXSVRT_k2hhA_A@mail.gmail.com

On Wed, Jan 4, 2012 at 3:56 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 04.01.2012 08:42, Amit Kapila wrote:
>>
>> Why does PostgreSQL need to write a WAL record for a checkpoint when it
>> maintains the same information in the pg_control file?
>
>
> I guess it wouldn't be strictly necessary...

Apart from replicated standbys, which need that info for running restartpoints.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding Checkpoint Redo Record
Date: 2012-01-04 18:00:06
Message-ID: CA+Tgmoa4GGiUFW+SvDhqXzGjN-xLcSpMJyXBONVt58tGDPOrcg@mail.gmail.com

On Wed, Jan 4, 2012 at 11:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, Jan 4, 2012 at 3:56 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 04.01.2012 08:42, Amit Kapila wrote:
>>>
>>> Why does PostgreSQL need to write a WAL record for a checkpoint when it
>>> maintains the same information in the pg_control file?
>>
>>
>> I guess it wouldn't be strictly necessary...
>
> Apart from replicated standbys, which need that info for running restartpoints.

Yeah.

But the OP makes me wonder: why can a standby only perform a
restartpoint where the master performed a checkpoint? It seems like a
standby ought to be able to create a restartpoint anywhere, just by
writing everything, flushing it to disk, and updating pg_control. I
assume there's some reason that doesn't work; I just don't know what
it is...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding Checkpoint Redo Record
Date: 2012-01-04 21:06:45
Message-ID: 22044.1325711205@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> But the OP makes me wonder: why can a standby only perform a
> restartpoint where the master performed a checkpoint? It seems like a
> standby ought to be able to create a restartpoint anywhere, just by
> writing everything, flushing it to disk, and updating pg_control.

Perhaps, but then crash restarts would have to accept start pointers
that point at any random place in the WAL. I like the additional error
checking of verifying that there's a checkpoint record there. Also,
I think the full-page-write mechanism would no longer protect against
torn pages during replay if you did that.
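
To make the torn-page concern concrete, here is a toy Python model. It is a
sketch under stated assumptions, not PostgreSQL code: it assumes a full-page
image is logged the first time a page is touched after each checkpoint's
redo point, and the names generate_wal and replay_is_torn_page_safe are
hypothetical.

```python
from typing import List, Set, Tuple

# Simplified WAL record: (page_id, carries_full_page_image).
Record = Tuple[int, bool]


def generate_wal(touches: List[int], checkpoint_positions: Set[int]) -> List[Record]:
    """Log a full-page image for the first touch of each page after a checkpoint."""
    wal: List[Record] = []
    imaged: Set[int] = set()
    for pos, page in enumerate(touches):
        if pos in checkpoint_positions:
            imaged.clear()  # a checkpoint's redo point falls just before this record
        wal.append((page, page not in imaged))
        imaged.add(page)
    return wal


def replay_is_torn_page_safe(wal: List[Record], start: int) -> bool:
    """True if every page replayed from `start` is first restored from a full image."""
    seen: Set[int] = set()
    for page, has_image in wal[start:]:
        if page not in seen and not has_image:
            return False  # a delta would be applied to a possibly torn page
        seen.add(page)
    return True


wal = generate_wal(touches=[1, 2, 1, 2, 1], checkpoint_positions={0, 3})
safe_from_checkpoint = replay_is_torn_page_safe(wal, start=3)
safe_from_arbitrary = replay_is_torn_page_safe(wal, start=2)
```

In this model, replay that starts at a checkpoint's redo point always meets
a full-page image before any incremental change to a page, while replay from
position 2 meets an incremental change to page 1 first.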

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding Checkpoint Redo Record
Date: 2012-01-04 21:26:47
Message-ID: CA+TgmoZC=oijDYkWtg3dBX+ZcXFy-Wfeu3QDD8rJbpkJ71TRtA@mail.gmail.com

On Wed, Jan 4, 2012 at 4:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> But the OP makes me wonder: why can a standby only perform a
>> restartpoint where the master performed a checkpoint?  It seems like a
>> standby ought to be able to create a restartpoint anywhere, just by
>> writing everything, flushing it to disk, and updating pg_control.
>
> Perhaps, but then crash restarts would have to accept start pointers
> that point at any random place in the WAL.  I like the additional error
> checking of verifying that there's a checkpoint record there.

I could go either way on that one, but...

> Also
> I think the full-page-write mechanism would no longer protect against
> torn pages during replay if you did that.

...this is a very good point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company