Re: Miscalculation in IsCheckpointOnSchedule()

Lists: pgsql-patches
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org
Subject: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 05:26:12
Message-ID: 20071114132759.30E4.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

I run long DBT-2 with 8.3beta2 and saw checkpoint spikes periodically.
The progress against WAL segments consumption *jumped up* in such cases.

It seems to be a miscalculation in IsCheckpointOnSchedule().
xrecoff is uint32, therefore the difference of two xrecoffs could
be between -4G and +4G. We should not cast it to int32, that domain
is [-2G, +2G).

Here is a patch to fix it, casting xrecoff to double directly.

Index: src/backend/postmaster/bgwriter.c
===================================================================
--- src/backend/postmaster/bgwriter.c (HEAD)
+++ src/backend/postmaster/bgwriter.c (working copy)
@@ -718,7 +718,7 @@
recptr = GetInsertRecPtr();
elapsed_xlogs =
(((double) (int32) (recptr.xlogid - ckpt_start_recptr.xlogid)) * XLogSegsPerFile +
- ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
+ ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
CheckPointSegments;

if (progress < elapsed_xlogs)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 05:32:14
Message-ID: 14221.1195018334@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /

Surely this makes matters worse, not better. What happens near a segment
boundary crossing?

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 06:09:55
Message-ID: 20071114144003.30E7.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
> > + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better. What happens near a segment
> boundary crossing?

Here is the dumped progres information by the attached patch
(only for debug purpose).

cur prog. xlog prog. time prog.
[-400ms] 0.030503 0.019247 0.031158 (diff xlogid=0, xrecoff=82665472)
[-200ms] 0.031176 0.019957 0.031839 (diff xlogid=0, xrecoff=85712896)
[*] 0.031860 1.020706 0.032521 (diff xlogid=1, xrecoff=105709568)

> recptr.xrecoff - ckpt_start_recptr.xrecoff

recptr.xrecoff is reset to 0 or so when xlogid is bumped up. At that time,
if ckpt_start_recptr.xrecoff is greater than 2G, we cannot represent
the difference with int32 because the value is less than -2G.
Casting double after int32 does not help us.

We use ( xlogid * 255 * 16MB + xrecoff ) as a base value for the calculation.
If we interprets xrecoff=105709568 at [*] as "minus uint32", we can calcurate
it correctly as below:

without fix with fix
[-400ms] 82665472 82665472 (same)
[-200ms] 85712896 85712896 (same)
[*] 4383899648 88932353 (= 1*255*16MB - (0xFFFFFFFF - 105709568) )

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
83xlogdbg.patch application/octet-stream 1.5 KB

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 11:47:08
Message-ID: 473AE03C.9040406@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> - ((double) (int32) (recptr.xrecoff - ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>> + ((double) recptr.xrecoff - (double) ckpt_start_recptr.xrecoff) / XLogSegSize) /
>
> Surely this makes matters worse, not better. What happens near a segment
> boundary crossing?

Hmm. There seems to be another little bug in there. XLogSegsPerFile is
defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
It should be 256. That leads to negative elapsed_xlogs estimates at xlog
file boundaries. XLogCheckpointNeeded suffers from it too; the number of
segments consumed since last checkpoint is off by one per each xlogid,
so we trigger the checkpoint a little bit too late.

Otherwise, the patch looks good to me. I also tested the calculation
with a little C program (attached), and it seems to work on segment and
xlog file boundaries just fine.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
xlog-arithmetic.c text/x-csrc 1.4 KB

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To:
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 12:08:38
Message-ID: 473AE546.6080703@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>>> - ((double) (int32) (recptr.xrecoff -
>>> ckpt_start_recptr.xrecoff)) / XLogSegSize) /
>>> + ((double) recptr.xrecoff - (double)
>>> ckpt_start_recptr.xrecoff) / XLogSegSize) /
>>
>> Surely this makes matters worse, not better. What happens near a segment
>> boundary crossing?
>
> Hmm. There seems to be another little bug in there. XLogSegsPerFile is
> defined as 0xffffffff/XLogSegSize, which is 255 with default settings.
> It should be 256. That leads to negative elapsed_xlogs estimates at xlog
> file boundaries. XLogCheckpointNeeded suffers from it too; the number of
> segments consumed since last checkpoint is off by one per each xlogid,
> so we trigger the checkpoint a little bit too late.

I'll take that back. We intentionally don't use the last possible
segment of each xlog file, to avoid overflows. Sorry for the noise.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Miscalculation in IsCheckpointOnSchedule()
Date: 2007-11-14 21:15:33
Message-ID: 15711.1195074933@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Surely this makes matters worse, not better. What happens near a segment
>> boundary crossing?

> Here is the dumped progres information by the attached patch
> (only for debug purpose).

Oh, I take that back. I was thinking that conversion of unsigned to
double usually does the Wrong Thing, but in this context it seems to be
the right thing.

regards, tom lane