Re: [9.3 bug] disk space in pg_xlog increases during archive recovery

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Andres Freund" <andres(at)2ndquadrant(dot)com>, "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <hlinnakangas(at)vmware(dot)com>, "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [9.3 bug] disk space in pg_xlog increases during archive recovery
Date: 2014-02-12 12:23:54
Message-ID: 60EB3E4B873541D393B6364D1F008EDA@maumau
Lists: pgsql-hackers

From: "Andres Freund" <andres(at)2ndquadrant(dot)com>
> On 2014-02-02 23:50:40 +0900, Fujii Masao wrote:
>> Right. If standby_mode is enabled, checkpoint_segments can trigger
>> the restartpoint. But the problem is that the timing of restartpoint
>> depends on not only the checkpoint parameters (i.e.,
>> checkpoint_timeout and checkpoint_segments) that are used during
>> archive recovery but also the checkpoint WAL that was generated
>> by the master.
>
> Sure. But we really *need* all the WAL since the last checkpoint's redo
> location locally to be safe.
>
>> For example, could you imagine the case where the master generated
>> only one checkpoint WAL since the last backup and it crashed with
>> database corruption. Then DBA decided to perform normal archive
>> recovery by using the last backup. In this case, even if DBA reduces
>> both checkpoint_timeout and checkpoint_segments, only one
>> restartpoint can occur during recovery. This low frequency of
>> restartpoint might fill up the disk space with lots of WAL files.
>
> I am not sure I understand the point of this scenario. If the primary
> crashed after a checkpoint, there won't be that much WAL since it
> happened...
>
>> > If the issue is that you're not using standby_mode (if so, why?), then
>> > the fix maybe is to make that apply to a wider range of situations.
>>
>> I guess that he is not using standby_mode because, according to
>> his first email in this thread, he said he would like to prevent WAL
>> from accumulating in pg_xlog during normal archive recovery (i.e., PITR).
>
> Well, that doesn't necessarily prevent you from using
> standby_mode... But yes, that might be the case.
>
> I wonder if we shouldn't just always look at checkpoint segments during
> !crash recovery.

Maybe we could consider that direction, but there is a problem: archive
recovery becomes slower than in 9.1 because of the repeated restartpoints.
Archive recovery should be as fast as possible, because it typically applies
dozens or hundreds of WAL files and the DBA wants to resume operation
immediately.

So I think we should restore the 9.1 behavior for archive recovery. The
attached patch keeps restored archived WAL files in pg_xlog/ only during
standby recovery. It is based on Fujii-san's revision of the patch, with the
AllowCascadeReplication() condition removed from two if statements.
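
To illustrate the intended behavior, here is a minimal Python sketch. It is
not the patch itself. The names RecoveryMode, keep_restored_segment and
segments_left_in_pg_xlog are hypothetical, and the model assumes that no
restartpoint removes any segment while the replay runs.

```python
from enum import Enum


class RecoveryMode(Enum):
    """Hypothetical labels for the two recovery modes discussed here."""
    ARCHIVE = "archive recovery (PITR, standby_mode off)"
    STANDBY = "standby recovery (standby_mode on)"


def keep_restored_segment(mode: RecoveryMode) -> bool:
    # Assumption: only standby recovery keeps a restored segment in
    # pg_xlog/ under its own name; plain archive recovery does not.
    return mode is RecoveryMode.STANDBY


def segments_left_in_pg_xlog(mode: RecoveryMode, replayed: int) -> int:
    """Count restored segments still kept under their own names after
    `replayed` segments, assuming no restartpoint has removed any yet."""
    kept = 0
    for _ in range(replayed):
        if keep_restored_segment(mode):
            kept += 1  # stays until a restartpoint removes it
        # otherwise the restored file is not kept once it has been replayed
    return kept
```

Under this model, the number of kept segments grows with the number of
replayed segments in standby recovery and stays at zero in plain archive
recovery, which is the 9.1 behavior the patch aims to restore.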

Regards
MauMau

Attachment: wal_increase_in_pitr_v4.patch (application/octet-stream, 5.0 KB)
